Similar articles
20 similar articles found (search time: 15 ms)
1.
When an amino-acid sequence cannot be optimized for both folding and function, folding can get compromised in favor of function. To understand this tradeoff better, we devise a novel method for extracting the “function-less” folding-motif of a protein fold from a set of structurally similar but functionally diverse proteins. We then obtain the β-trefoil folding-motif, and study its folding using structure-based models and molecular dynamics simulations. Comparison with the folding of wild-type β-trefoil proteins shows that function affects folding in two ways: In the slower folding interleukin-1β, binding sites make the fold more complex, increase contact order and slow folding. In the faster folding hisactophilin, residues which could have been part of the folding-motif are used for function. This reduces the density of native contacts in functional regions and increases folding rate. The folding-motif helps identify subtle structural deviations which perturb folding. These may then be used for functional annotation. Further, the folding-motif could potentially be used as a first step in the sequence design of function-less scaffold proteins. Desired function can then be engineered into these scaffolds.
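The abstract links slower folding to increased contact order but does not say how contact order is computed. As a rough illustration only, the sketch below uses the commonly cited relative contact order (mean sequence separation of native contacts, divided by chain length). The function name and the toy contact lists are hypothetical and are not taken from the paper.

```python
def relative_contact_order(contacts, n_residues):
    """Relative contact order: mean |i - j| over native contacts, divided by chain length.

    contacts   -- iterable of (i, j) residue-index pairs that are in native contact
    n_residues -- number of residues in the chain
    """
    contacts = list(contacts)
    if not contacts or n_residues <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * n_residues)


# Hypothetical 10-residue chains: one with mostly local contacts,
# one with contacts between residues far apart in sequence.
local_contacts = [(1, 4), (2, 5), (3, 6)]
long_range_contacts = [(1, 10), (2, 9), (3, 8)]

print(relative_contact_order(local_contacts, 10))
print(relative_contact_order(long_range_contacts, 10))
```

With these made-up contact lists, the chain dominated by long-range contacts gets the higher value, which is the direction the abstract associates with slower folding.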

2.
Nitric oxide (NO) is an important signaling molecule that regulates many physiological processes in plants. One of the most important regulatory mechanisms of NO is S-nitrosylation—the covalent attachment of NO to cysteine residues. Although the involvement of cysteine S-nitrosylation in the regulation of protein functions is well established, its substrate specificity remains unknown. Identification of candidates for S-nitrosylation and their target cysteine residues is fundamental for studying the molecular mechanisms and regulatory roles of S-nitrosylation in plants. Several experimental methods that are based on the biotin switch have been developed to identify target proteins for S-nitrosylation. However, these methods have their limits. Thus, computational methods are attracting considerable attention for the identification of modification sites in proteins. Using GPS-SNO version 1.0, a recently developed S-nitrosylation site-prediction program, a set of 16,610 candidate proteins for S-nitrosylation containing 31,900 S-nitrosylation sites was isolated from the entire Arabidopsis proteome using the medium threshold. In the compartments “chloroplast,” “CUL4-RING ubiquitin ligase complex,” and “membrane” more than 70% of the proteins were identified as candidates for S-nitrosylation. The high number of identified candidates in the proteome reflects the importance of redox signaling in these compartments. An analysis of the functional distribution of the predicted candidates showed that proteins involved in signaling processes exhibited the highest prediction rate. In a set of 46 proteins, where 53 putative S-nitrosylation sites were already experimentally determined, the GPS-SNO program predicted 60 S-nitrosylation sites, but only 11 overlap with the results of the experimental approach. 
In general, a computer-assisted method for the prediction of targets for S-nitrosylation is a very good tool; however, further development, such as including the three-dimensional structure of proteins in such analyses, would improve the identification of S-nitrosylation sites.

3.
Post-translational modifications are important functional determinants for intermediate filament (IF) proteins. Phosphorylation of IF proteins regulates filament organization, solubility, and cell-protective functions. Most known IF protein phosphorylation sites are serines localized in the variable “head” and “tail” domain regions. By contrast, little is known about site-specific tyrosine phosphorylation or its implications on IF protein function. We used available proteomic data from large scale studies to narrow down potential phospho-tyrosine sites on the simple epithelial IF protein keratin 8 (K8). Validation of the predicted sites using a pan-phosphotyrosine and a site-specific antibody, which we generated, revealed that the highly conserved Tyr-267 in the K8 “rod” domain was basally phosphorylated. The charge at this site was critically important, as demonstrated by altered filament organization of site-directed mutants, Y267F and Y267D, the latter exhibiting significantly diminished solubility. Pharmacological inhibition of the protein-tyrosine phosphatase PTP1B increased K8 Tyr-267 phosphorylation, decreased solubility, and increased K8 filament bundling, whereas PTP1B overexpression had the opposite effects. Furthermore, there was significant co-localization between K8 and a “substrate-trapping” mutant of PTP1B (D181A). Because K8 Tyr-267 is conserved in many IFs (QYE motif), we tested the effect of the paralogous Tyr in glial fibrillary acidic protein (GFAP), which is mutated in Alexander disease (Y242D). Similar to K8, Y242D GFAP exhibited highly irregular filament organization and diminished solubility. Our results implicate the rod domain QYE motif tyrosine as an important determinant of IF assembly and solubility properties that can be dynamically modulated by phosphorylation.

4.
HIV type 1 (HIV-1) is characterized by its rapid genetic evolution, leading to challenges in anti-HIV therapy. However, the sequence variations in HIV-1 proteins are not randomly distributed due to a combination of functional constraints and genetic drift. In this study, we examined patterns of sequence variability for evidence of linked sequence changes (termed coevolution or covariation) in 15 HIV-1 proteins. We show that the percentage of charged residues among the coevolving residues is significantly higher than that in the HIV-1 proteins as a whole. Most of the coevolving residues are spatially proximal in the protein structures and tend to form relatively compact and independent units in the tertiary structures, termed “protein sectors”. These protein sectors are closely associated with anti-HIV drug resistance, T cell epitopes, and antibody binding sites. Finally, we explored candidate peptide inhibitors based on the protein sectors. Our results establish an association between the coevolving residues and molecular functions of HIV-1 proteins, and provide valuable knowledge of the pathology of HIV-1 and of therapeutics development.

5.
The EGF-induced MAP kinase cascade is one of the most important and best characterized networks in intracellular signalling. It has a vital role in the development and maturation of living organisms. However, when deregulated, it is involved in the onset of a number of diseases. Based on a computational model describing a “surface” and an “internalized” parallel route, we use systems biology techniques to characterize aspects of the network’s functional organization. We examine the re-organization of protein groups from low to high external stimulation, define functional groups of proteins within the network, determine the parameter best encoding for input intensity and predict the effect of protein removal on the system’s output response. Extensive functional re-organization of proteins is observed in the lower end of stimulus concentrations. As we move to higher concentrations the variability is less pronounced. Six functional groups emerged from a consensus clustering approach, reflecting different dynamical aspects of the network. Mutual information investigation revealed that the maximum activation rate of the two output proteins best encodes for stimulus intensity. Removal of each protein of the network resulted in a range of graded effects, from complete silencing to intense activation. Our results provide a new “vista” of the EGF-induced MAP kinase cascade, from the perspective of complex self-organizing systems. Functional grouping of the proteins reveals an organizational scheme contrasting the current understanding of modular topology. The six identified groups may provide the means to experimentally follow the dynamics of this complex network. Also, the vulnerability analysis approach may be used for the development of novel therapeutic targets in the context of personalized medicine.

6.
Eukaryotic cells commonly use protein kinases in signaling systems that relay information and control a wide range of processes. These enzymes have a fundamentally similar structure, but achieve functional diversity through variable regions that determine how the catalytic core is activated and recruited to phosphorylation targets. “Hippo” pathways are ancient protein kinase signaling systems that control cell proliferation and morphogenesis; the NDR/LATS family protein kinases, which associate with “Mob” coactivator proteins, are central but incompletely understood components of these pathways. Here we describe the crystal structure of budding yeast Cbk1–Mob2, to our knowledge the first of an NDR/LATS kinase–Mob complex. It shows a novel coactivator-organized activation region that may be unique to NDR/LATS kinases, in which a key regulatory motif apparently shifts from an inactive binding mode to an active one upon phosphorylation. We also provide a structural basis for a substrate docking mechanism previously unknown in AGC family kinases, and show that docking interaction provides robustness to Cbk1’s regulation of its two known in vivo substrates. Co-evolution of docking motifs and phosphorylation consensus sites strongly indicates that a protein is an in vivo regulatory target of this hippo pathway, and predicts a new group of high-confidence Cbk1 substrates that function at sites of cytokinesis and cell growth. Moreover, docking peptides arise in unstructured regions of proteins that are probably already kinase substrates, suggesting a broad sequential model for adaptive acquisition of kinase docking in rapidly evolving intrinsically disordered polypeptides.

7.
A major challenge in designing proteins de novo to bind user-defined ligands with high affinity is finding backbone structures into which a new binding site geometry can be engineered with high precision. Recent advances in methods to generate protein fold families de novo have expanded the space of accessible protein structures, but it is not clear to what extent de novo proteins with diverse geometries also expand the space of designable ligand binding functions. We constructed a library of 25,806 high-quality ligand binding sites and developed a fast protocol to place (“match”) these binding sites into both naturally occurring and de novo protein families with two fold topologies: Rossmann and NTF2. Each matching step involves engineering new binding site residues into each protein “scaffold”, which is distinct from the problem of comparing already existing binding pockets. 5,896 and 7,475 binding sites could be matched to the Rossmann and NTF2 fold families, respectively. De novo designed Rossmann and NTF2 protein families can support 1,791 and 678 binding sites, respectively, that cannot be matched to naturally existing structures with the same topologies. While the number of protein residues in ligand binding sites is the major determinant of matching success, ligand size and primary sequence separation of binding site residues also play important roles. The number of matched binding sites is a power-law function of the number of members in a fold family. Our results suggest that de novo sampling of geometric variations on diverse fold topologies can significantly expand the space of designable ligand binding sites for a wealth of possible new protein functions.

8.
The residue composition of a ligand binding site determines the interactions available for diffusion-mediated ligand binding, and understanding the general composition of these sites is of great importance if we are to gain insight into the functional diversity of the proteome. Many structure-based drug design methods utilize such heuristic information for improving prediction or characterization of ligand-binding sites in proteins of unknown function. The Binding MOAD database is one of the largest curated sets of protein-ligand complexes, and provides a source of diverse, high-quality data for establishing general trends of residue composition from currently available protein structures. We present an analysis of 3,295 non-redundant proteins with 9,114 non-redundant binding sites to identify residues over-represented in binding regions versus the rest of the protein surface. The Binding MOAD database delineates biologically-relevant “valid” ligands from “invalid” small-molecule ligands bound to the protein. Invalids are present in the crystallization medium and serve no known biological function. Contacts are found to differ between these classes of ligands, indicating that residue composition of biologically relevant binding sites is distinct not only from the rest of the protein surface, but also from surface regions capable of opportunistic binding of non-functional small molecules. To confirm these trends, we perform a rigorous analysis of the variation of residue propensity with respect to the size of the dataset and the content bias inherent in structure sets obtained from a large protein structure database. The optimal size of the dataset for establishing general trends of residue propensities, as well as strategies for assessing the significance of such trends, are suggested for future studies of binding-site composition.
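The abstract does not define "residue propensity". One plausible reading, assumed here, is the ratio of a residue's frequency in binding sites to its frequency on the rest of the surface, so that values above 1 mean over-representation. The sketch below computes that ratio on made-up residue lists; the function name and data are hypothetical, and this is not the paper's procedure.

```python
from collections import Counter


def residue_propensities(binding_residues, other_surface_residues):
    """Ratio of each residue's frequency in binding sites to its frequency elsewhere on the surface."""
    site = Counter(binding_residues)
    rest = Counter(other_surface_residues)
    n_site, n_rest = sum(site.values()), sum(rest.values())
    if n_site == 0 or n_rest == 0:
        raise ValueError("both residue lists must be non-empty")
    propensities = {}
    for aa in site.keys() | rest.keys():
        f_site = site[aa] / n_site
        f_rest = rest[aa] / n_rest
        # A residue never seen outside binding sites has no finite ratio.
        propensities[aa] = f_site / f_rest if f_rest > 0 else float("inf")
    return propensities


# Hypothetical one-letter residue codes, not real Binding MOAD data.
binding = list("WWYYHG")
surface_rest = list("WYGGGGKKEE")

props = residue_propensities(binding, surface_rest)
for aa in sorted(props, key=props.get, reverse=True):
    print(aa, round(props[aa], 2))
```

On real data the ratio for rare residues is noisy, which is presumably why the abstract stresses dataset size and significance testing.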

9.
The amino acid sequences of proteins determine their three-dimensional structures and functions. However, how sequence information is related to structures and functions is still enigmatic. In this study, we show that at least a part of the sequence information can be extracted by treating amino acid sequences of proteins as a collection of English words, based on a working hypothesis that amino acid sequences of proteins are composed of short constituent amino acid sequences (SCSs) or “words”. We first confirmed that the English language very likely follows Zipf's law, a special case of a power law. We found that the rank-frequency plot of SCSs in proteins exhibits a similar distribution when low-rank tails are excluded. In comparison with natural English and “compressed” English without spaces between words, amino acid sequences of proteins show larger linear ranges and smaller exponents with heavier low-rank tails, demonstrating that the SCS distribution in proteins is largely scale-free. The distribution pattern of SCSs in proteins is similar among species, but species-specific features are also present. Based on the availability scores of SCSs, we found that sequence motifs are enriched in high-availability sites (i.e., “key words”) and vice versa. In fact, the highest availability peak within a given protein sequence often directly corresponds to a sequence motif. The amino acid composition of high-availability sites within motifs is different from that of entire motifs and all protein sequences, suggesting the possible functional importance of specific SCSs and their compositional amino acids within motifs. We anticipate that our availability-based word decoding approach is complementary to sequence alignment approaches in predicting functionally important sites of unknown proteins from their amino acid sequences.
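A rank-frequency analysis of the kind described can be sketched as follows. The sketch assumes that SCSs can be approximated by overlapping fixed-length k-mers, which the abstract does not state, and it estimates the Zipf exponent with an ordinary least-squares fit in log-log space. The function names are hypothetical, and the availability score mentioned in the abstract is not reproduced.

```python
import math
from collections import Counter


def kmer_rank_frequency(sequences, k):
    """Counts of overlapping k-mers across all sequences, sorted from most to least frequent."""
    counts = Counter(
        seq[i:i + k] for seq in sequences for i in range(len(seq) - k + 1)
    )
    return sorted(counts.values(), reverse=True)


def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank); about -1 for ideal Zipf data."""
    if len(frequencies) < 2:
        raise ValueError("need at least two ranks to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy input: counts of overlapping 2-mers in one short made-up sequence.
print(kmer_rank_frequency(["AAAB"], 2))

# Frequencies proportional to 1/rank, i.e. an ideal Zipf distribution.
ideal_zipf = [120, 60, 40, 30, 24]
print(loglog_slope(ideal_zipf))
```

A fit over all ranks is the simplest choice; the abstract notes that the tails deviate from the linear range, so a real analysis would restrict the fit to that range.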

10.
Adaptor protein complex 2 α and β-appendage domains act as hubs for the assembly of accessory protein networks involved in clathrin-coated vesicle formation. We identify a large repertoire of β-appendage interactors by mass spectrometry. These interact with two distinct ligand interaction sites on the β-appendage (the “top” and “side” sites) that bind motifs distinct from those previously identified on the α-appendage. We solved the structure of the β-appendage with a peptide from the accessory protein Eps15 bound to the side site and with a peptide from the accessory cargo adaptor β-arrestin bound to the top site. We show that accessory proteins can bind simultaneously to multiple appendages, allowing these to cooperate in enhancing ligand avidities that appear to be irreversible in vitro. We now propose that clathrin, which interacts with the β-appendage, achieves ligand displacement in vivo by self-polymerisation as the coated pit matures. This changes the interaction environment from liquid-phase, affinity-driven interactions, to interactions driven by solid-phase stability (“matricity”). Accessory proteins that interact solely with the appendages are thereby displaced to areas of the coated pit where clathrin has not yet polymerised. However, proteins such as β-arrestin (non-visual arrestin) and autosomal recessive hypercholesterolemia protein, which have direct clathrin interactions, will remain in the coated pits with their interacting receptors.

11.
The Drosophila and plant (maize) functional counterparts of the abundant vertebrate chromosomal protein HMGB1 (HMG-D and ZmHMGB1, respectively) differ from HMGB1 in having a single HMG box, as well as basic and acidic flanking regions that vary greatly in length and charge. We show that despite these variations, HMG-D and ZmHMGB1 exist in dynamic assemblies in which the basic HMG boxes and linkers associate with their intrinsically disordered, predominantly acidic, tails in a manner analogous to that observed previously for HMGB1. The DNA-binding surfaces of the boxes and linkers are occluded in “auto-inhibited” forms of the protein, which are in equilibrium with transient, more open structures that are “binding-competent.” This strongly suggests that the mechanism of auto-inhibition may be a general one. HMG-D and ZmHMGB1 differ from HMGB1 in having phosphorylation sites in their tail and linker regions. In both cases, in vitro phosphorylation of serine residues within the acidic tail stabilizes the assembled form, suggesting another level of regulation for interaction with DNA, chromatin, and other proteins that is not possible for the uniformly acidic (hence unphosphorylatable) tail of HMGB1.

12.
Protein–protein interactions are challenging targets for modulation by small molecules. Here, we propose an approach that harnesses the increasing structural coverage of protein complexes to identify small molecules that may target protein interactions. Specifically, we identify ligand and protein binding sites that overlap upon alignment of homologous proteins. Of the 2,619 protein structure families observed to bind proteins, 1,028 also bind small molecules (250–1000 Da), and 197 exhibit a statistically significant (p<0.01) overlap between ligand and protein binding positions. These “bi-functional positions”, which bind both ligands and proteins, are particularly enriched in tyrosine and tryptophan residues, similar to “energetic hotspots” described previously, and are significantly less conserved than mono-functional and solvent exposed positions. Homology transfer identifies ligands whose binding sites overlap at least 20% of the protein interface for 35% of domain–domain and 45% of domain–peptide mediated interactions. The analysis recovered known small-molecule modulators of protein interactions as well as predicted new interaction targets based on the sequence similarity of ligand binding sites. We illustrate the predictive utility of the method by suggesting structural mechanisms for the effects of sanglifehrin A on HIV virion production, bepridil on the cellular entry of anthrax edema factor, and fusicoccin on vertebrate developmental pathways. The results, available at http://pibase.janelia.org, represent a comprehensive collection of structurally characterized modulators of protein interactions, and suggest that homologous structures are a useful resource for the rational design of interaction modulators.

13.
It has been a long-standing goal in systems biology to find relations between the topological properties and functional features of protein networks. However, most of the focus in network studies has been on highly connected proteins (“hubs”). As a complementary notion, it is possible to define bottlenecks as proteins with a high betweenness centrality (i.e., network nodes that have many “shortest paths” going through them, analogous to major bridges and tunnels on a highway map). Bottlenecks are, in fact, key connector proteins with surprising functional and dynamic properties. In particular, they are more likely to be essential proteins. In fact, in regulatory and other directed networks, betweenness (i.e., “bottleneck-ness”) is a much more significant indicator of essentiality than degree (i.e., “hub-ness”). Furthermore, bottlenecks correspond to the dynamic components of the interaction network—they are significantly less well coexpressed with their neighbors than nonbottlenecks, implying that expression dynamics is wired into the network topology.
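The distinction between degree ("hub-ness") and betweenness ("bottleneck-ness") can be illustrated on a small graph. The sketch below implements Brandes' algorithm for an unweighted, undirected graph, which is a simplifying assumption because the abstract also discusses directed networks. The toy network and node names are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted, undirected)."""
    bc = {v: 0.0 for v in adj}
    for source in adj:
        order = []                          # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}        # predecessors on shortest paths from source
        n_paths = {v: 0 for v in adj}       # number of shortest paths from source
        dist = {v: -1 for v in adj}
        n_paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                        # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):           # accumulate from the farthest nodes back
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                bc[w] += dependency[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: score / 2 for v, score in bc.items()}


# Two 4-node cliques joined only through node "x".
edges = list(combinations("abcd", 2)) + list(combinations("efgh", 2))
edges += [("d", "x"), ("x", "e")]
graph = build_graph(edges)
scores = betweenness(graph)
for node in sorted(graph):
    print(node, "degree", len(graph[node]), "betweenness", scores[node])
```

In this toy network, node `x` has the lowest degree but the highest betweenness, so it is a bottleneck without being a hub, while `a` has a higher degree and zero betweenness.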

14.
Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.

15.
The power of genome sequencing depends on the ability to understand what genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the “back catalog” of enzymology – “orphan enzymes,” those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme “back catalog” is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology’s “back catalog” another powerful tool to drive accurate genome annotation.

16.
Correlated mutation analysis (CMA) is an effective approach for predicting functional and structural residue interactions from multiple sequence alignments (MSAs) of proteins. As nearby residues may also play a role in a given functional interaction, we were interested in seeing whether covarying sites were clustered, and whether this could be used to enhance the predictive power of CMA. A large-scale search for coevolving regions within protein domains revealed that if two sites in an MSA covary, then neighboring sites in the alignment also typically covary, resulting in clusters of covarying residues. The program PatchD (http://www.uhnres.utoronto.ca/labs/tillier/) was developed to measure the covariation between disconnected sequence clusters to reveal patch covariation. Patches that exhibit strong covariation identify multiple residues that are generally nearby in the protein structure, suggesting that the detection of covarying patches can be used in conjunction with traditional CMA approaches to reveal functional interaction partners. Proteins 2010. © 2009 Wiley-Liss, Inc.
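Covariation between alignment columns can be scored in several ways, and the abstract does not say which statistic PatchD uses. As an illustration only, the sketch below uses mutual information between two columns and, as a hypothetical stand-in for patch covariation, the mean mutual information over all column pairs drawn from two patches. The toy alignment and all function names are assumptions.

```python
import math
from collections import Counter


def column(msa, index):
    """Residues found at one alignment position, one per sequence."""
    return [seq[index] for seq in msa]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns of equal length."""
    n = len(col_a)
    count_a, count_b = Counter(col_a), Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (x, y), c in count_ab.items():
        p_xy = c / n
        mi += p_xy * math.log2(p_xy / ((count_a[x] / n) * (count_b[y] / n)))
    return mi


def patch_score(msa, patch_a, patch_b):
    """Hypothetical patch score: mean mutual information over column pairs from two patches."""
    pairs = [(i, j) for i in patch_a for j in patch_b]
    return sum(
        mutual_information(column(msa, i), column(msa, j)) for i, j in pairs
    ) / len(pairs)


# Toy alignment: columns 0-1 change together with columns 3-4; column 2 is invariant.
msa = ["AKLGE", "AKLGE", "GELAK", "GELAK"]

print(mutual_information(column(msa, 0), column(msa, 3)))  # covarying pair
print(mutual_information(column(msa, 0), column(msa, 2)))  # invariant column
print(patch_score(msa, [0, 1], [3, 4]))
```

Raw mutual information is sensitive to alignment depth and phylogenetic signal, so a real analysis would need a correction or a significance test; none is attempted here.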

17.
In vaccinia virus-infected cell cultures, cellular protein synthesis was inhibited 50% at 2 hr postinfection (PI) and 80 to 90% by 4 hr PI. Input virus was responsible for this inhibition. Five early proteins, coded for by the viral genome, could be detected at 2 to 3 hr PI. Normally, their synthesis did not continue beyond 6 hr PI, at which time synthesis of a different set of proteins began. When DNA replication was blocked, synthesis of these early proteins continued until 9 to 12 hr PI. The bulk of the proteins which were incorporated into mature virus were synthesized at 8 hr PI and thereafter. The time of their formation was close to the time at which virus maturation occurred. However, 15% of the protein found in mature virus was synthesized early in the infectious cycle. The quantity of “early viral protein” which was not incorporated into mature virus was almost as large as the quantity of viral protein which did appear in mature virus. The “early” and “late” proteins could be shown to have separate and distinct immunological properties. The role of this large quantity of “early” protein is discussed.

18.
Metabolic stability is a very important characteristic of proteins that is related to their global flexibility, intramolecular fluctuations, various internal dynamic processes, and many biological functions. Determining a protein's metabolic stability would provide useful information for an in-depth understanding of the dynamic action mechanisms of proteins. Although several experimental methods have been developed to measure a protein's metabolic stability, they are time-consuming and expensive. Reported in this paper is a computational method that features (1) integrating various properties of proteins, such as biochemical and physicochemical properties, subcellular locations, network properties and protein complex property, (2) using the mRMR (Maximum Relevance & Minimum Redundancy) principle and the IFS (Incremental Feature Selection) procedure to optimize the prediction engine, and (3) being able to assign proteins to one of four types: “short”, “medium”, “long”, and “extra-long” half-life spans. Our analysis revealed that the following seven features played major roles in determining the stability of proteins: (1) KEGG enrichment scores of the protein and its neighbors in the network, (2) subcellular locations, (3) polarity, (4) amino acid composition, (5) hydrophobicity, (6) secondary structure propensity, and (7) the number of protein complexes in which the protein is involved. We observed an intriguing correlation between the predicted metabolic stability of some proteins and the real half-life of the drugs designed to target them. These findings might provide useful insights for designing protein-stability-relevant drugs. The computational method can also be used as a large-scale tool for annotating the metabolic stability of the avalanche of protein sequences generated in the post-genomic age.

19.

Background

The RAG-encoded proteins, RAG-1 and RAG-2, regulate site-specific recombination events in somatic immune B- and T-lymphocytes to generate the acquired immune repertoire. Catalytic activities of the RAG proteins are related to the recombinase functions of a pre-existing mobile DNA element in the DDE recombinase/RNAse H family, sometimes termed the “RAG transposon”.

Methodology/Principal Findings

Novel to this work is the suggestion that the DDE recombinase responsible for the origins of acquired immunity was encoded by a primordial herpes virus, rather than a “RAG transposon.” A subsequent “arms race” between immunity to herpes infection and the immune system obscured primary amino acid similarities between herpes and immune system proteins but preserved regulatory, structural and functional similarities between the respective recombinase proteins. In support of this hypothesis, evidence is reviewed from previously published data that a modern herpes virus protein family with properties of a viral recombinase is co-regulated with both RAG-1 and RAG-2 by closely linked cis-acting co-regulatory sequences. Structural and functional similarity is also reviewed between the putative herpes recombinase and both the DDE site of the RAG-1 protein and another DDE/RNAse H family nuclease, the Argonaute protein component of RISC (RNA induced silencing complex).

Conclusions/Significance

A “co-regulatory” model of the origins of V(D)J recombination and the acquired immune system can account for the observed linked genomic structure of RAG-1 and RAG-2 in non-vertebrate organisms such as the sea urchin that lack an acquired immune system and V(D)J recombination. Initially the regulated expression of a viral recombinase in immune cells may have been positively selected by its ability to stimulate innate immunity to herpes virus infection rather than V(D)J recombination. Unlike the “RAG-transposon” hypothesis, the proposed model can be readily tested by comparative functional analysis of herpes virus replication and V(D)J recombination.

20.
A new method for the classification of domain movements in proteins is described and applied to 1822 pairs of structures from the Protein Data Bank that represent a domain movement in two-domain proteins. The method is based on changes in contacts between residues from the two domains in moving from one conformation to the other. We argue that there are five types of elemental contact changes and that these relate to five model domain movements called: “free”, “open-closed”, “anchored”, “sliding-twist”, and “see-saw.” A directed graph is introduced called the “Dynamic Contact Graph” which represents the contact changes in a domain movement. In many cases a graph, or part of a graph, provides a clear visual metaphor for the movement it represents and is a motif that can be easily recognised. The Dynamic Contact Graphs are often comprised of disconnected subgraphs indicating independent regions which may play different roles in the domain movement. The Dynamic Contact Graph for each domain movement is decomposed into elemental Dynamic Contact Graphs, those that represent elemental contact changes, allowing us to count the number of instances of each type of elemental contact change in the domain movement. This naturally leads to sixteen classes into which the 1822 domain movements are classified.
