Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Although many naturally occurring proteins consist of multiple domains, most studies on protein folding to date deal with single-domain proteins or isolated domains of multi-domain proteins. Studies of multi-domain protein folding are required for further advancing our understanding of protein folding mechanisms. Borrelia outer surface protein A (OspA) is a β-rich two-domain protein, in which two globular domains are connected by a rigid and stable single-layer β-sheet. Thus, OspA is particularly suited as a model system for studying the interplay between domains in protein folding. Here, we studied the equilibria and kinetics of the urea-induced folding–unfolding reactions of OspA probed with tryptophan fluorescence and ultraviolet circular dichroism. Global analysis of the experimental data revealed compelling lines of evidence for accumulation of an on-pathway intermediate during kinetic refolding and for the identity between the kinetic intermediate and a previously described equilibrium unfolding intermediate. The results suggest that the intermediate has the fully native structure in the N-terminal domain and the single-layer β-sheet, with the C-terminal domain still unfolded. The observation of the productive on-pathway folding intermediate clearly indicates substantial interactions between the two domains mediated by the single-layer β-sheet. We propose that a rigid and stable intervening region between two domains creates an overlap between two folding units and can energetically couple their folding reactions.
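An equilibrium unfolding intermediate of the kind mentioned in abstract 1 is often described with a three-state scheme N ⇌ I ⇌ U. The sketch below assumes the common linear extrapolation model, in which each free-energy change depends linearly on urea concentration; the abstract does not say which model the authors fitted, and every parameter value here is invented for illustration rather than taken from the OspA study.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)


def three_state_fractions(urea, dg_ni, m_ni, dg_iu, m_iu, temp_k=298.15):
    """Return the equilibrium fractions (N, I, U) at a urea concentration in M.

    Assumes dG(urea) = dG_water - m * urea for each step (linear
    extrapolation model). dg_* are in kJ/mol, m_* in kJ/(mol*M).
    """
    rt = R_KJ * temp_k
    k_ni = math.exp(-(dg_ni - m_ni * urea) / rt)  # [I]/[N]
    k_iu = math.exp(-(dg_iu - m_iu * urea) / rt)  # [U]/[I]
    z = 1.0 + k_ni + k_ni * k_iu                  # partition sum relative to N
    return 1.0 / z, k_ni / z, k_ni * k_iu / z


if __name__ == "__main__":
    # Hypothetical parameters: N->I midpoint at 2.5 M, I->U midpoint at 5 M.
    for urea in (0.0, 2.5, 3.75, 5.0, 8.0):
        n, i, u = three_state_fractions(urea, 20.0, 8.0, 30.0, 6.0)
        print(f"{urea:4.2f} M  N={n:.3f}  I={i:.3f}  U={u:.3f}")
```

With well-separated midpoints the intermediate dominates between the two transitions, which is the situation in which it becomes detectable in equilibrium experiments.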

2.
Having multiple domains in proteins can lead to partial folding and increased aggregation. Folding cooperativity, the all-or-nothing folding of a protein, can reduce this aggregation propensity. In agreement with bulk experiments, a coarse-grained structure-based model of the three-domain protein, E. coli Adenylate kinase (AKE), folds cooperatively. Domain interfaces have previously been implicated in the cooperative folding of multi-domain proteins. To understand their role in AKE folding, we computationally create mutants with deleted inter-domain interfaces and simulate their folding. We find that inter-domain interfaces play a minor role in the folding cooperativity of AKE. On further analysis, we find that unlike other multi-domain proteins whose folding has been studied, the domains of AKE are not singly-linked. Two of its domains have two linkers to the third one, i.e., they are inserted into the third one. We use circular permutation to modify AKE chain-connectivity and convert inserted-domains into singly-linked domains. We find that domain insertion in AKE achieves the following: (1) It facilitates folding cooperativity even when domains have different stabilities. Insertion constrains the N- and C-termini of inserted domains and stabilizes their folded states. Therefore, domains that perform conformational transitions can be smaller with fewer stabilizing interactions. (2) Inter-domain interactions are not needed to promote folding cooperativity and can be tuned for function. In AKE, these interactions help promote conformational-dynamics-limited catalysis. Finally, using structural bioinformatics, we suggest that domain insertion may also facilitate the cooperative folding of other multi-domain proteins.

3.
Here we present a comparison between protein fragments produced by limited proteolysis and those identified by computational cutting based on the building block folding model. The principles upon which the two methods are based are different. Limited proteolysis of natively folded proteins occurs at flexible sites and never at the level of chain segments of regular secondary structure such as alpha-helices. Therefore, the targets for limited proteolysis are locally unfolded regions. In contrast, the computational cutting algorithm considers the compactness of the fragments, their nonpolar buried surface area, and their isolatedness, that is, the surface area which was buried prior to the cutting and becomes exposed subsequently. Despite the different criteria, there is an overall correspondence between sites or regions of limited proteolysis with those identified by computational cutting. The computational cutting method has been applied to several model proteins for which detailed limited proteolysis data are available, namely apomyoglobin, cytochrome c, ribonuclease A, alpha-lactalbumin, and thermolysin. As expected, more cuts are obtained computationally than experimentally and the agreement is better when a number of proteolytic enzymes are used. For example, cytochrome c is cleaved by thermolysin at 56-57, 45-46, and at 80-81, and by proteinase K at 48-49 and 50-51. Incubation of the noncovalent and native-like complex of cytochrome c fragments 1-56 and 57-104 with proteinase K yielded the gapped protein species 1-48/57-104 and finally 1-40/57-104. Computational cutting of cytochrome c reproduced the major experimental observations, with cuts at 47, 64-65 or 65-66 and 80-81 and an unstable 32-47 region not assigned to any building block. The next step, not addressed in this work, is to probe the ability of the generated fragments to fold independently. 
Since both the computational algorithm and limited proteolysis attempt to dissect the protein folding problem, the general agreement between the two procedures is gratifying. This consistency allows us to propose the use of limited proteolysis to produce protein fragments that can fold independently and, therefore, to study folding intermediates. The results of the present study appear to validate the building block folding model and are in line with the proposal that protein folding is a hierarchical process, where parts constituting local minima of energy fold first, with their subsequent association and mutual stabilization to finally yield the global fold.

4.
Structural genomic projects envision almost routine protein structure determinations, which are currently imaginable only for small proteins with molecular weights below 25,000 Da. For larger proteins, structural insight can be obtained by breaking them into small segments of amino acid sequences that can fold into native structures, even when isolated from the rest of the protein. Such segments are autonomously folding units (AFU) and have sizes suitable for fast structural analyses. Here, we propose to expand an intuitive procedure often employed for identifying biologically important domains to an automatic method for detecting putative folded protein fragments. The procedure is based on the recognition that large proteins can be regarded as a combination of independent domains conserved among diverse organisms. We thus have developed a program that reorganizes the output of BLAST searches and detects regions with a large number of similar sequences. To automate the detection process, it is reduced to a simple geometrical problem of recognizing rectangular shaped elevations in a graph that plots the number of similar sequences at each residue of a query sequence. We used our program to quantitatively corroborate the premise that segments with conserved sequences correspond to domains that fold into native structures. We applied our program to a test data set composed of 99 amino acid sequences containing 150 segments with structures listed in the Protein Data Bank, and thus known to fold into native structures. Overall, the fragments identified by our program have an almost 50% probability of forming a native structure, and comparable results are observed with sequences containing domain linkers classified in SCOP. Furthermore, we verified that our program identifies AFU in libraries from various organisms, and we found a significant number of AFU candidates for structural analysis, covering an estimated 5 to 20% of the genomic databases. 
Altogether, these results argue that methods based on sequence similarity can be useful for dissecting large proteins into small autonomously folding domains, and such methods may provide efficient support to structural genomics projects.
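The geometric idea in abstract 4 (plot the number of similar sequences at each residue, then look for rectangular elevations) can be sketched as follows. This is a minimal illustration, not the authors' program: the hit intervals, the `min_count` and `min_len` thresholds, and the function names are all hypothetical, and parsing of real BLAST output is left out.

```python
def coverage_profile(query_len, hits):
    """Count how many hits cover each query residue.

    hits is a list of (start, end) pairs, 1-based and inclusive,
    giving the aligned region of each similar sequence on the query.
    """
    diff = [0] * (query_len + 2)  # difference array, index 0 unused
    for start, end in hits:
        diff[start] += 1
        diff[end + 1] -= 1
    profile, running = [], 0
    for pos in range(1, query_len + 1):
        running += diff[pos]
        profile.append(running)
    return profile


def elevated_segments(profile, min_count, min_len):
    """Return (start, end) segments where coverage stays >= min_count.

    min_count is assumed to be at least 1. Segments shorter than
    min_len residues are discarded.
    """
    segments, start = [], None
    # A trailing 0 acts as a sentinel that closes any open segment.
    for pos, count in enumerate(profile + [0], start=1):
        if count >= min_count and start is None:
            start = pos
        elif count < min_count and start is not None:
            if pos - start >= min_len:
                segments.append((start, pos - 1))
            start = None
    return segments


if __name__ == "__main__":
    hits = [(3, 10), (4, 10), (3, 11), (15, 16)]  # made-up alignments
    profile = coverage_profile(20, hits)
    print(profile)
    print(elevated_segments(profile, min_count=2, min_len=5))
```

A fixed threshold is the simplest way to recognise a plateau; the published method may well use a different criterion for what counts as a rectangular elevation.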

5.
Protein insolubility is a major problem when producing recombinant proteins (e.g., to be used as antigens) from large cDNAs in Escherichia coli. Here, we describe a system using three convertible plasmid vectors to screen for soluble proteins produced in E. coli. This system experimentally identified any random cDNA fragments producing soluble protein domains. Shotgun fragments are introduced into any of our three plasmids, which contain Gateway recombination sites, and fused in-frame to the ORF of the protein tag. These plasmids produced N-terminal GST- and C-terminal three-frame-adaptive FLAG-tagged proteins, kanamycin-resistant gene-tagged proteins (which were pre-selected for in-frame fused cDNAs), or GFP-tagged fusion proteins. The latter is useful as a fluorescence indicator of protein folding. The Gateway recombination sites promote smooth conversion for enrichment of in-frame clones and facilitate both protein solubility assays and final production of proteins without the C-terminal tag. This high-throughput screening method is particularly useful for procedures that require the handling of many cDNAs in parallel.

6.
Although the vast majority of the human proteome is represented by multi-domain proteins, the study of multi-domain folding and misfolding is a relatively poorly explored field. The protein Whirlin is a multi-domain scaffolding protein expressed in the inner ear. It is characterized by the presence of tandem repeats of PDZ domains. The first two PDZ domains of Whirlin (PDZ1 and PDZ2, namely P1P2) are structurally close and separated by a disordered short linker. We recently described the folding mechanism of the P1P2 tandem. The difference in thermodynamic stability of the two domains allowed us to selectively unfold one or both PDZ domains and to pinpoint the accumulation of a misfolded intermediate, which we demonstrated to retain physiological binding activity. In this work, we provide an extensive characterization of the folding and unfolding of P1P2. Based on the observed data, we describe an integrated kinetic analysis that satisfactorily fits the experiments and provides a valuable model to interpret multi-domain folding. The experimental and analytical approaches described in this study may be of general interest for the interpretation of complex multi-domain protein folding kinetics.

7.
Limited proteolysis experiments can be successfully used to probe conformational features of proteins. In a number of studies it has been demonstrated that the sites of limited proteolysis along the polypeptide chain of a protein are characterized by enhanced backbone flexibility, implying that proteolytic probes can pinpoint the sites of local unfolding in a protein chain. Limited proteolysis was used to analyze the partly folded (molten globule) states of several proteins, such as apomyoglobin, alpha-lactalbumin, calcium-binding lysozymes, cytochrome c and human growth hormone. These proteins were induced to acquire the molten globule state under specific solvent conditions, such as low pH. In general, the protein conformational features deduced from limited proteolysis experiments correlate well with those derived from other biophysical and spectroscopic techniques. Limited proteolysis is also most useful for isolating protein fragments that can fold autonomously and thus behave as protein domains. Moreover, the technique can be used to identify and prepare protein fragments that are able to associate into a native-like and often functional protein complex. Overall, our results underscore the utility of the limited proteolysis approach for unravelling molecular features of proteins and appear to prompt its systematic use as a simple first step in the elucidation of structure-dynamics-function relationships of a novel and rare protein, especially if available in minute amounts.

8.
The structural domains of proteins have often been identified through the use of limited proteolysis. In structural genomics studies, it is necessary to carry this out in a high-throughput manner. Here, we constructed a novel high-throughput system, which consists of cell-free protein expression and one-step affinity purification, followed by limited proteolysis using a unique new method, referred to as the “on-beads method”. All these steps were carried out in 96-well plate format and completed in two days, even by manual handling. The merits of the new method versus the conventional one are as follows: (1) experimental times are reduced, (2) the sample preparation for limited proteolysis experiments is simplified, and (3) both protein purification and limited digestion can be performed “in situ” on the same sample plate. This preparation method is therefore suitable for highly automated, proteolytic analyses coupled to mass spectrometry techniques at a micro-scale protein expression level. The resulting protease-resistant fragments were analyzed by MALDI-TOF-MS and protein domains of 34 mouse cDNA products were identified with this system.

9.
The identification and annotation of protein domains provides a critical step in the accurate determination of molecular function. Both computational and experimental methods of protein structure determination may be deterred by large multi-domain proteins or flexible linker regions. Knowledge of domains and their boundaries may reduce the experimental cost of protein structure determination by allowing researchers to work on a set of smaller and possibly more successful alternatives. Current domain prediction methods often rely on sequence similarity to conserved domains and as such are poorly suited to detect domain structure in poorly conserved or orphan proteins. We present here a simple computational method to identify protein domain linkers and their boundaries from sequence information alone. Our domain predictor, Armadillo (http://armadillo.blueprint.org), uses any amino acid index to convert a protein sequence to a smoothed numeric profile from which domains and domain boundaries may be predicted. We derived an amino acid index called the domain linker propensity index (DLI) from the amino acid composition of domain linkers using a non-redundant structure dataset. The index indicates that Pro and Gly show a propensity for linker residues while small hydrophobic residues do not. Armadillo predicts domain linker boundaries from Z-score distributions and obtains 35% sensitivity with DLI in a two-domain, single-linker dataset (within +/-20 residues from linker). The combination of DLI and an entropy-based amino acid index increases the overall Armadillo sensitivity to 56% for two domain proteins. Moreover, Armadillo achieves 37% sensitivity for multi-domain proteins, surpassing most other prediction methods. Armadillo provides a simple, but effective method by which prediction of domain boundaries can be obtained with reasonable sensitivity. 
Armadillo should prove to be a valuable tool for rapidly delineating protein domains in poorly conserved proteins or those with no sequence neighbors. Used as a first-line predictor, Armadillo could also improve the results of domain meta-predictors.
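The pipeline outlined in abstract 9 (amino acid index, smoothed numeric profile, Z-scores) can be illustrated with a short sketch. This is not Armadillo's implementation: the toy index below simply scores Pro and Gly as 1 and everything else as 0, standing in for the real DLI values, and the window size and Z-score cut-off are arbitrary assumptions.

```python
from statistics import mean, pstdev

# Hypothetical stand-in for a linker propensity index (not the real DLI).
TOY_INDEX = {"P": 1.0, "G": 1.0}


def smoothed_profile(sequence, index, window=5):
    """Map residues to index values and apply a centred moving average."""
    values = [index.get(aa, 0.0) for aa in sequence]
    half = window // 2
    profile = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        profile.append(mean(values[lo:hi]))  # window is truncated at the ends
    return profile


def z_scores(profile):
    """Standardise a profile; a flat profile yields all zeros."""
    mu, sigma = mean(profile), pstdev(profile)
    if sigma == 0:
        return [0.0] * len(profile)
    return [(v - mu) / sigma for v in profile]


def candidate_linker_positions(sequence, index, window=5, z_cut=1.5):
    """Return 1-based positions whose smoothed score has Z >= z_cut."""
    z = z_scores(smoothed_profile(sequence, index, window))
    return [i + 1 for i, score in enumerate(z) if score >= z_cut]


if __name__ == "__main__":
    seq = "A" * 20 + "PGPGP" + "A" * 20  # artificial two-block sequence
    print(candidate_linker_positions(seq, TOY_INDEX))
```

Swapping `TOY_INDEX` for any other per-residue scale changes the profile without touching the rest of the code, which mirrors the abstract's statement that the predictor accepts any amino acid index.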

10.
Limited proteolysis is widely used in biochemical and crystallographic studies to determine domain organization, folding properties, and ligand binding activities of proteins. The method has limitations, however, due to the difficulties in obtaining sufficient amounts of correctly folded proteins and in interpreting the results of the proteolysis. A new limited proteolysis method, named protease accessibility laddering (PAL), avoids these complications. In PAL, tagged proteins are purified on magnetic beads in their natively folded state. While attached to the beads, proteins are probed with proteases. Proteolytic fragments are eluted and detected by immunoblotting with antibodies against the tag (e.g., Protein A, GFP, and 6xHis). PAL readily detects domain boundaries and flexible loops within proteins. A combination of PAL and comparative protein structure modeling allows characterization of previously unknown structures (e.g., Sec31, a component of the COPII coated vesicle). PAL's high throughput should greatly facilitate structural genomic and proteomic studies.

11.
Eukaryotic genomes encode a considerably higher fraction of multi-domain proteins than their prokaryotic counterparts. It has been postulated that efficient co-translational and sequential domain folding has facilitated the explosive evolution of multi-domain proteins in eukaryotes by the recombination of pre-existent domains. Here, we tested whether eukaryotes and bacteria differ generally in the folding efficiency of multi-domain proteins generated by domain recombination. To this end, we compared the folding behavior of a series of recombinant proteins comprised of green fluorescent protein (GFP) fused to four different robustly folding proteins through six different linkers upon expression in Escherichia coli and the yeast Saccharomyces cerevisiae. We found that, unlike yeast, bacteria are remarkably inefficient at folding these fusion proteins, even at comparable levels of expression. In vitro and in vivo folding experiments demonstrate that the GFP domain imposes significant constraints on de novo folding of its fusion partners in bacteria, consistent with a largely post-translational folding mechanism. This behavior may result from an interference of GFP with adjacent domains during folding due to the particular topology of the beta-barrel GFP structure. By following the accumulation of enzymatic activity, we found that the rate of appearance of correctly folded fusion protein per ribosome is indeed considerably higher in yeast than in bacteria.

12.
Domains are considered the basic units of protein folding, evolution, and function. Decomposing each protein into modular domains is thus a basic prerequisite for accurate functional classification of biological molecules. Here, we present ADDA, an automatic algorithm for domain decomposition and clustering of all protein domain families. We use alignments derived from an all-on-all sequence comparison to define domains within protein sequences based on a global maximum likelihood model. In all, 90% of domain boundaries are predicted within 10% of domain size when compared with the manual domain definitions given in the SCOP database. A representative database of 249,264 protein sequences was decomposed into 450,462 domains. These domains were clustered on the basis of sequence similarities into 33,879 domain families containing at least two members with less than 40% sequence identity. Validation against family definitions in the manually curated databases SCOP and PFAM indicates almost perfect unification of various large domain families while contamination by unrelated sequences remains at a low level. The global survey of protein-domain space by ADDA confirms that most large and universal domain families are already described in PFAM and/or SMART. However, a survey of the complete set of mobile modules leads to the identification of 1479 new interesting domain families which shuffle around in multi-domain proteins. The data are publicly available at ftp://ftp.ebi.ac.uk/pub/contrib/heger/adda.

13.
Most protein domains are found in multi-domain proteins, yet most studies of protein folding have concentrated on small, single-domain proteins or on isolated domains from larger proteins. Spectrin domains are small (106 amino acid residues), independently folding domains consisting of three long alpha-helices. They are found in multi-domain proteins with a number of spectrin domains in tandem array. Structural studies have shown that in these arrays the last helix of one domain forms a continuous helix with the first helix of the following domain. It has been demonstrated that a number of spectrin domains are stabilised by their neighbours. Here we investigate the molecular basis for cooperativity between adjacent spectrin domains 16 and 17 from chicken brain alpha-spectrin (R16 and R17). We show that whereas the proteins unfold as a single cooperative unit at 25 °C, cooperativity is lost at higher temperatures and in the presence of stabilising salts. Mutations in the linker region also cause the cooperativity to be lost. However, the cooperativity does not rely on specific interactions in the linker region alone. Most mutations in the R17 domain cause a decrease in cooperativity, whereas proteins with mutations in the R16 domain still fold cooperatively. We propose a mechanism for this behaviour.

14.
Many membrane proteins feature autonomously folded extramembranous domains which, when isolated from the intact protein, perform biochemical functions relevant to biological activity. Whereas intact membrane proteins usually require detergent solubilization for purification, most extramembranous fragments are soluble in aqueous solution. If appropriately constructed, such fragments are often crystallizable and the resulting atomic structures can lead to important biological insight. In most instances, these fragments are produced in recombinant expression systems. To be crystallizable, molecular fragments should be uniform in composition and conformation and be available in abundance. Considerations for the production of crystallizable fragments of membrane proteins include the definition of fragment boundaries, the control of nonuniformities introduced by glycosylation or phosphorylation, and optimization of expression systems. These aspects are addressed here in general terms and in the case studies of applications to CD4, CD8, the insulin receptor kinase, and N-cadherin.

15.
Morimoto S, Tamura A. Biochemistry, 2004, 43(21): 6596-6605.
We have determined the key regions for protein foldability by creating multiple crossover libraries from two proteins that share a similar fold but have low sequence identity and differ significantly in stability. One protein is the propeptide of a serine protease, subtilisin BPN', and the other is Pleurotus ostreatus proteinase A inhibitor 1 (POIA1). The propeptide has a compact structure when complexed with subtilisin but is unstructured when isolated, whereas POIA1 adopts a stable structure. We selected four of the conserved amino acid residues as the boundaries of crossover sites and used these residues to create identical cohesive ends for assembling synthetic DNA fragments. Each segment has one or two secondary structure units, and the interchange of these structural elements produces 32 (= 2^5) combinations, including the propeptide and POIA1. The stability of these mutants was first screened by formation of turbid zones on skim milk plates containing subtilisin BPN'. It was shown that six variants were foldable and structural units necessary for folding were identified. Further fragmentation and recombination of these mutants (the "multisection" method) revealed that two interactions between secondary structures are important; one is interaction between the loop-alpha1 and beta2-turn-beta3, and the other is hydrophobic interaction between the adjoining beta1 and beta4 strands. We were also able to specify the significant amino acid combinations for tolerance to proteolysis. These combinatorial methods not only elucidate how domains can be interchanged to make the whole protein foldable but also extract essential regions for the function, which is correlated with the instability of the molecule.
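The count of 32 combinations follows from four crossover boundaries dividing each protein into five segments, each of which can come from either parent. The sketch below enumerates that library; the single-letter labels `P` (propeptide) and `I` (POIA1) are shorthand chosen for this example, not notation from the paper.

```python
from itertools import product


def enumerate_chimeras(n_segments=5, parents=("P", "I")):
    """List every segment-wise combination of the parent proteins.

    Each chimera is a string with one parent label per segment,
    ordered from the N-terminal to the C-terminal segment.
    """
    return ["".join(choice) for choice in product(parents, repeat=n_segments)]


if __name__ == "__main__":
    library = enumerate_chimeras()
    print(len(library))
    print(library[:4], "...", library[-1])
```

The two uniform strings in the list correspond to the unmodified propeptide and POIA1, which is why the abstract counts the parents among the 32.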

16.
Comparative studies of the proteomes from different organisms have provided valuable information about protein domain distribution in the kingdoms of life. Earlier studies have been limited by the fact that only about 50% of the proteomes could be matched to a domain. Here, we have extended these studies by including less well-defined domain definitions, Pfam-B and clustered domains, MAS, in addition to Pfam-A and SCOP domains. It was found that a significant fraction of these domain families are homologous to Pfam-A or SCOP domains. Further, we show that all regions that do not match a Pfam-A or SCOP domain contain a significantly higher fraction of disordered structure. These unstructured regions may be contained within orphan domains or function as linkers between structured domains. Using several different definitions we have re-estimated the number of multi-domain proteins in different organisms and found that several methods all predict that eukaryotes have approximately 65% multi-domain proteins, while the prokaryotes consist of approximately 40% multi-domain proteins. However, these numbers are strongly dependent on the exact choice of cut-off for domains in unassigned regions. In conclusion, all eukaryotes have similar fractions of multi-domain proteins and disorder, whereas a high fraction of repeating domains is seen only in multicellular eukaryotes. This implies a role for repeats in cell-cell contacts while the other two features are important for intracellular functions.

17.
Protein domains are the fundamental structural and functional units of proteins. Information on protein domain boundaries is helpful in understanding the evolution, structures and functions of proteins, and also plays an important role in protein classification. In this paper, we propose a support vector regression-based method to address the problem of protein domain boundary identification based on novel input profiles extracted from the AAindex database. Our method achieves an average sensitivity of ∼36.5% and an average specificity of ∼81% for multi-domain protein chains, which is better overall than the performance of published approaches to domain boundary identification. Because it uses sequence information alone, the method is also simpler and faster.
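Abstract 17 reports sensitivity and specificity without defining them, so the sketch below shows only one common convention for scoring boundary predictions: a true boundary counts as found if some prediction lies within a tolerance window, and a prediction counts as correct if it lies within that window of some true boundary. The ±20-residue default and the function name are assumptions for illustration, and the second value is a precision-style measure that may differ from the specificity used in the paper.

```python
def boundary_scores(predicted, true, tolerance=20):
    """Score predicted boundary positions against reference positions.

    Returns (sensitivity, precision):
      sensitivity = fraction of true boundaries with a prediction
                    within +/- tolerance residues
      precision   = fraction of predictions within +/- tolerance
                    residues of some true boundary
    Empty inputs give 0.0 for the corresponding score.
    """
    found = sum(any(abs(t - p) <= tolerance for p in predicted) for t in true)
    correct = sum(any(abs(p - t) <= tolerance for t in true) for p in predicted)
    sensitivity = found / len(true) if true else 0.0
    precision = correct / len(predicted) if predicted else 0.0
    return sensitivity, precision


if __name__ == "__main__":
    # Made-up positions for a hypothetical three-domain chain.
    print(boundary_scores(predicted=[100, 250, 400], true=[110, 300]))
```

Reported percentages from different methods are comparable only when the tolerance window and the definition of the second measure match, which is worth checking before comparing the figures quoted in these abstracts.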

18.
We describe a method to identify protein domain boundaries from sequence information alone based on the assumption that hydrophobic residues cluster together in space. SnapDRAGON is a suite of programs developed to predict domain boundaries based on the consistency observed in a set of alternative ab initio three-dimensional (3D) models generated for a given protein multiple sequence alignment. This is achieved by running a distance geometry-based folding technique in conjunction with a 3D-domain assignment algorithm. The overall accuracy of our method in predicting the number of domains for a non-redundant data set of 414 multiple alignments, representing 185 single and 231 multiple-domain proteins, is 72.4%. Using domain linker regions observed in the tertiary structures associated with each query alignment as the standard of truth, inter-domain boundary positions are delineated with an accuracy of 63.9% for proteins comprising continuous domains only, and 35.4% for proteins with discontinuous domains. Overall, domain boundaries are delineated with an accuracy of 51.8%. The prediction accuracy values are independent of the pair-wise sequence similarities within each of the alignments. These results demonstrate the capability of our method to delineate domains in protein sequences associated with a wide variety of structural domain organisation.

19.
Ribosomal proteins from Escherichia coli have been isolated by a mild purification procedure. Their tertiary structure has been explored by two techniques, proton magnetic resonance and limited proteolysis. A number of proteins when subjected to limited proteolysis produce resistant fragments in good yields. In most cases this does not depend on the specificity of the enzyme used. The proteins S15, S16, S17 and L30 are not degraded at all, whereas a few proteins are very susceptible to proteolysis. 1H-NMR experiments show that the majority of the ribosomal proteins have a uniquely folded tertiary structure. This is particularly pronounced in the four proteins mentioned above which resist proteolysis. In general, a good agreement is observed between the degree of proteolytic resistance and the amount of folding indicated by NMR spectroscopy. Similar studies on a few ribosomal proteins purified under denaturing conditions show that, in contrast, these protein preparations are not structurally homogeneous and that they contain a mixture of denatured and renatured molecules. The results are interpreted in terms of a compactly folded tertiary structure for the four proteinase-resistant proteins while the majority of the other proteins appear to have two domains, one compactly folded and resistant to proteinase and the other flexible and susceptible to proteolysis. A few proteins seem to have a completely flexible structure and can therefore be easily degraded.

20.
Domains are the evolutionary units that comprise proteins, and most proteins are built from more than one domain. Domains can be shuffled by recombination to create proteins with new arrangements of domains. Using structural domain assignments, we examined the combinations of domains in the proteins of 131 completely sequenced organisms. We found two-domain and three-domain combinations that recur in different protein contexts with different partner domains. The domains within these combinations have a particular functional and spatial relationship. These units are larger than individual domains and we term them "supra-domains". Amongst the supra-domains, we identified some 1400 (1203 two-domain and 166 three-domain) combinations that are statistically significantly over-represented relative to the occurrence and versatility of the individual component domains. Over one-third of all structurally assigned multi-domain proteins contain these over-represented supra-domains. This means that investigation of the structural and functional relationships of the domains forming these popular combinations would be particularly useful for an understanding of multi-domain protein function and evolution as well as for genome annotation. These and other supra-domains were analysed for their versatility, duplication, their distribution across the three kingdoms of life and their functional classes. By examining the three-dimensional structures of several examples of supra-domains in different biological processes, we identify two basic types of spatial relationships between the component domains: the combined function of the two domains is such that either the geometry of the two domains is crucial and there is a tight constraint on the interface, or the precise orientation of the domains is less important and they are spatially separate. Frequently, the role of the supra-domain becomes clear only once the three-dimensional structure is known. 
Since this is the case for only a quarter of the supra-domains, we provide a list of the most important unknown supra-domains as potential targets for structural genomics projects.
