Similar articles
 20 similar articles found (search time: 31 ms)
1.
We describe a method to identify protein domain boundaries from sequence information alone, based on the assumption that hydrophobic residues cluster together in space. SnapDRAGON is a suite of programs developed to predict domain boundaries based on the consistency observed in a set of alternative ab initio three-dimensional (3D) models generated for a given protein multiple sequence alignment. This is achieved by running a distance geometry-based folding technique in conjunction with a 3D-domain assignment algorithm. The overall accuracy of our method in predicting the number of domains for a non-redundant data set of 414 multiple alignments, representing 183 single and 231 multiple-domain proteins, is 72.4%. Using domain linker regions observed in the tertiary structures associated with each query alignment as the standard of truth, inter-domain boundary positions are delineated with an accuracy of 63.9% for proteins comprising continuous domains only, and 35.4% for proteins with discontinuous domains. Overall, domain boundaries are delineated with an accuracy of 51.8%. The prediction accuracy values are independent of the pair-wise sequence similarities within each of the alignments. These results demonstrate the capability of our method to delineate domains in protein sequences associated with a wide variety of structural domain organisation.

2.
Domains are the main structural and functional units of larger proteins. They tend to be contiguous in primary structure and can fold and function independently. It has been observed that 10–20% of all encoded proteins contain duplicated domains and the average pairwise sequence identity between them is usually low. In the present study, we have analyzed the structural similarity between domain repeats of proteins with known structures available in the Protein Data Bank using structure-based inter-residue interaction measures such as the number of long-range contacts, surrounding hydrophobicity, and pairwise interaction energy. We used the RADAR program to detect the repeats in a protein sequence, which were further validated using Pfam domain assignments. The sequence identity between the repeats in domains ranges from 20 to 40% and their secondary structural elements are well conserved. The number of long-range contacts, surrounding hydrophobicity calculations and pairwise interaction energy of the domain repeats clearly reveal the conservation of the 3D structural environment in the repeats of domains. The proportions of mainchain–mainchain hydrogen bonds and hydrophobic interactions are also highly conserved between the repeats. The present study suggests that the computation of these structure-based parameters will give better clues about the tertiary environment of the repeats in domains. The folding rates of individual domains in the repeats, predicted using the long-range order parameter, correlate well with most of the experimentally observed folding rates for the analyzed independently folded domains.
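The long-range contact count mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the definition it used, so the thresholds below (more than 12 residues apart in sequence, Cα atoms within 8 Å) are assumptions, as are the function names `long_range_contacts` and `long_range_order`; normalizing by chain length is likewise only one plausible form of a long-range order parameter.

```python
import math


def long_range_contacts(ca_coords, min_separation=12, cutoff=8.0):
    """Count residue pairs that are distant in sequence but close in space.

    ca_coords: list of (x, y, z) C-alpha coordinates in angstroms.
    min_separation and cutoff are assumed values, not taken from the abstract.
    """
    count = 0
    n = len(ca_coords)
    for i in range(n):
        # Only consider partners more than min_separation residues away.
        for j in range(i + min_separation + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                count += 1
    return count


def long_range_order(ca_coords, min_separation=12, cutoff=8.0):
    """Long-range contacts per residue (an assumed normalization)."""
    n = len(ca_coords)
    if n == 0:
        return 0.0
    return long_range_contacts(ca_coords, min_separation, cutoff) / n
```

Comparing this value between two repeats of the same domain would be one simple way to check whether their tertiary environments are similar.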

3.
Although many naturally occurring proteins consist of multiple domains, most studies on protein folding to date deal with single-domain proteins or isolated domains of multi-domain proteins. Studies of multi-domain protein folding are required for further advancing our understanding of protein folding mechanisms. Borrelia outer surface protein A (OspA) is a β-rich two-domain protein, in which two globular domains are connected by a rigid and stable single-layer β-sheet. Thus, OspA is particularly suited as a model system for studying the interplay of domains in protein folding. Here, we studied the equilibria and kinetics of the urea-induced folding–unfolding reactions of OspA probed with tryptophan fluorescence and ultraviolet circular dichroism. Global analysis of the experimental data revealed compelling lines of evidence for accumulation of an on-pathway intermediate during kinetic refolding and for the identity between the kinetic intermediate and a previously described equilibrium unfolding intermediate. The results suggest that the intermediate has the fully native structure in the N-terminal domain and the single-layer β-sheet, with the C-terminal domain still unfolded. The observation of the productive on-pathway folding intermediate clearly indicates substantial interactions between the two domains mediated by the single-layer β-sheet. We propose that a rigid and stable intervening region between two domains creates an overlap between two folding units and can energetically couple their folding reactions.

4.
In this article, we present a de novo method for predicting protein domain boundaries, called OPUS-Dom. The core of the method is a novel coarse-grained folding method, VECFOLD, which constructs low-resolution structural models from a target sequence by folding a chain of vectors representing the predicted secondary-structure elements. OPUS-Dom generates a large ensemble of folded structure decoys by VECFOLD and labels the domain boundaries of each decoy by a domain parsing algorithm. Consensus domain boundaries are then derived from the statistical distribution of the putative boundaries and three empirical sequence-based domain profiles. OPUS-Dom generally outperformed several state-of-the-art domain prediction algorithms over various benchmark protein sets. Even though each VECFOLD-generated structure contains large errors, collectively these structures provide a more robust delineation of domain boundaries. The success of OPUS-Dom suggests that the arrangement of protein domains is more a consequence of the limited coordination patterns per domain that arise from tertiary packing of secondary-structure segments than of sequence-specific constraints.
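The idea of deriving consensus boundaries from many noisy decoys can be sketched as a simple voting scheme. This is not the OPUS-Dom procedure (which also combines sequence-based profiles); the function name `consensus_boundaries`, the smoothing `window`, and the `min_support` threshold are all hypothetical choices made for illustration.

```python
def consensus_boundaries(decoy_boundaries, seq_len, window=5, min_support=0.5):
    """Return boundary positions supported by enough decoys.

    decoy_boundaries: one list of 1-based boundary positions per decoy.
    window: each predicted boundary votes for positions within +/- window
            (hypothetical tolerance for imprecise decoys).
    min_support: minimum fraction of decoys that must vote for a position.
    """
    n_decoys = len(decoy_boundaries)
    if n_decoys == 0:
        return []

    # votes[pos] = number of decoys with a boundary near residue pos.
    votes = [0] * (seq_len + 1)
    for bounds in decoy_boundaries:
        covered = set()
        for b in bounds:
            lo, hi = max(1, b - window), min(seq_len, b + window)
            covered.update(range(lo, hi + 1))
        for pos in covered:  # each decoy votes at most once per position
            votes[pos] += 1

    threshold = min_support * n_decoys
    consensus, run = [], []
    # Sentinel position seq_len + 1 closes a run that reaches the chain end.
    for pos in range(1, seq_len + 2):
        if pos <= seq_len and votes[pos] >= threshold:
            run.append(pos)
        elif run:
            best = max(votes[p] for p in run)
            peaks = [p for p in run if votes[p] == best]
            consensus.append(peaks[len(peaks) // 2])  # middle of the plateau
            run = []
    return consensus
```

With four decoys placing boundaries at residues 50, 52, 48 and 120 in a 150-residue chain, the three nearby votes reinforce one another while the isolated prediction at 120 falls below the support threshold, which mirrors the abstract's point that individually inaccurate models can still give a robust collective answer.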

5.
The identification of protein domains within multi-domain proteins is a persistent problem. Here, we describe an experimental method (shotgun proteolysis) based on random DNA fragmentation and protease selection of the encoded polypeptides on phage for this purpose. We applied the method to the Escherichia coli genome and identified 124 protease-resistant fragments; several were re-cloned for expression as soluble fragments in bacteria, and corresponded to autonomously folding units with folding energies similar to natural protein domains (ΔG(u) = 3.8–6.6 kcal/mol). Structural information was available for approximately half of the selected proteins, which corresponded to compact, globular and domain-sized units that had been derived from a wide range of protein superfamilies. Furthermore, boundaries of the selected fragments correlated with domain boundaries as defined by bioinformatics predictions (R² = 0.82; p = 0.016). However, predictions were incomplete or entirely lacking for the remaining fragments, reflecting the limited proteome coverage of current bioinformatics methods. Shotgun proteolysis therefore provides a means to identify domains and other autonomously folding units on a genome-wide scale, without any prior knowledge of sequence or structure. Shotgun proteolysis should be particularly valuable for structural studies of proteins and represents a high-throughput alternative to the classical limited proteolysis method for the isolation of stable components of multi-domain proteins.

6.
Multidomain proteins continue to be a major challenge in protein structure prediction. Here we present a Monte Carlo (MC) algorithm, implemented within Rosetta, to predict the structure of proteins in which one domain is inserted into another. Three MC moves combine rigid-body and loop movements to search the constrained conformational space by structure disruption and subsequent repair of chain breaks. Local searches find that the algorithm samples and recovers near-native structures consistently. Further global searches produced top-ranked structures within 5 Å in 31 of 50 cases in low-resolution mode, and refinement of top-ranked low-resolution structures produced models within 2 Å in 21 of 50 cases. Rigid-body orientations were often correctly recovered despite errors in linker conformation. The algorithm is broadly applicable to de novo structure prediction of both naturally occurring and engineered domain insertion proteins.

7.
Receptor tyrosine kinases (RTKs) are single-span transmembrane receptors in which relatively conserved intracellular kinase domains are coupled to divergent extracellular modules. The extracellular domains initiate receptor signaling upon binding to either soluble or membrane-embedded ligands. The diversity of extracellular domain structures allows for coupling of many unique signaling inputs to intracellular tyrosine phosphorylation. The combinatorial power of this receptor system is further increased by the fact that multiple ligands can typically interact with the same receptor. Such ligands often act as biased agonists and initiate distinct signaling responses via activation of the same receptor. Mechanisms behind such biased agonism are largely unknown for RTKs, especially at the level of receptor–ligand complex structure. Using recent progress in understanding the structures of active RTK signaling units, we discuss selected mechanisms by which ligands couple receptor activation to distinct signaling outputs.

8.
Protein–protein interactions are thought to be mediated by domains, which are autonomous folding units of proteins. Recently, a second type of interaction has been suggested, mediated by short segments termed linear motifs, which are related to recognition elements of intrinsically disordered regions. Here, we propose a third kind of protein–protein recognition mechanism, mediated by disordered regions longer than 20–30 residues. Bioinformatics predictions and well‐characterized examples, such as the kinase‐inhibitory domain of Cdk inhibitors and the Wiskott–Aldrich syndrome protein (WASP)‐homology domain 2 of actin‐binding proteins, show that these disordered regions conform to the definition of domains rather than motifs, i.e., they represent functional, evolutionary, and structural units. Their functions are distinct from those of short motifs and ordered domains, and establish a third kind of interaction principle. With these points, we argue that these long disordered regions should be recognized as a distinct class of biologically functional protein domains.

9.
It is known that larger globular proteins are built from domains, relatively independent structural units. Domain size seems to be limited, and a single domain consists of from a few tens to a couple of hundred amino acids. Based on Monte Carlo simulations of a reduced protein model restricted to the face centered simple cubic lattice, with a minimal set of short-range and long-range interactions, we have shown that some model sequences upon the folding transition spontaneously divide into separate domains. The observed domain sizes closely correspond to the sizes of real protein domains. Short chains with a proper sequence pattern of the hydrophobic and polar residues undergo a two-state folding transition to the structurally ordered globular state, while similar longer sequences follow a multistate transition. Homopolymeric (uniformly hydrophobic) chains and random heteropolymers undergo a continuous collapse transition into a single globule, and the globular state is much less ordered. Thus, the factors responsible for the multidomain structure of proteins are a sufficiently long polypeptide chain and characteristic, protein-like sequence patterns. These findings provide some hints for the analysis of real sequences aimed at prediction of the domain structure of large proteins.

10.
Many proteins are composed of several domains that pack together into a complex tertiary structure. Multidomain proteins can be challenging for protein structure modeling, particularly those for which templates can be found for individual domains but not for the entire sequence. In such cases, homology modeling can generate high-quality models of the domains but not of the orientations between domains. Small-angle X-ray scattering (SAXS) reports the structural properties of entire proteins and has the potential for guiding homology modeling of multidomain proteins. In this article, we describe a novel multidomain protein assembly modeling method, SAXSDom, that integrates experimental knowledge from SAXS with a probabilistic Input-Output Hidden Markov model to assemble the structures of individual domains together. Four SAXS-based scoring functions were developed and tested, and the method was evaluated on multidomain proteins from two public datasets. Incorporation of SAXS information improved the accuracy of domain assembly for 40 out of 46 Critical Assessment of protein Structure Prediction (CASP) multidomain protein targets and 45 out of 73 multidomain protein targets from the ab initio domain assembly dataset. The results demonstrate that SAXS data can provide useful information to improve the accuracy of domain-domain assembly. The source code and tool packages are available at https://github.com/jianlin-cheng/SAXSDom

11.
With a growing number of structures available in the Brookhaven Protein Data Bank, automatic methods for domain identification are required for the construction of databases. Domains are considered to be clusters of secondary structure elements. Thus, helices and strands are first clustered using intersecondary structural distances between Cα positions, and dendrograms based on this distance measure are used to identify domains. Individual domains are recognized by a disjoint factor, which enables the automatic identification and classification into disjoint, interacting, and conjoint domains. Application to a database of 83 protein families and 18 unique structures shows that the approach provides an effective delineation of boundaries and identifies those proteins that can be considered as a single domain. A quantitative estimate of the interaction between domains has been proposed. The database of protein domains is a useful tool for understanding protein folding, for recognizing protein folds, and for understanding structure-activity relationships.
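Clustering secondary structure elements into candidate domains, as this abstract describes, can be illustrated with a small agglomerative clustering sketch. Several simplifications here are assumptions rather than the published method: each element is reduced to the centroid of its Cα atoms, average linkage is used, a fixed distance `cutoff` stands in for cutting the dendrogram, and the disjoint factor is not computed. The names `centroid` and `cluster_elements` are hypothetical.

```python
import math
from itertools import combinations


def centroid(points):
    """Mean position of a list of (x, y, z) coordinates."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def cluster_elements(elements, cutoff):
    """Group secondary structure elements by average-linkage clustering.

    elements: dict mapping an element name (e.g. "H1") to its C-alpha coords.
    cutoff: merging stops once the closest pair of clusters is farther
            apart than this distance in angstroms (assumed parameter).
    Returns a list of sets of element names, one set per candidate domain.
    """
    centers = {name: centroid(pts) for name, pts in elements.items()}
    clusters = [{name} for name in elements]

    def linkage(a, b):
        # Average distance between all cross-cluster pairs of centroids.
        total = sum(math.dist(centers[x], centers[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        if linkage(clusters[i], clusters[j]) > cutoff:
            break  # remaining clusters are treated as separate domains
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```

Recording the linkage distance at each merge would give the dendrogram heights; the sketch keeps only the final grouping to stay short.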

12.
MOTIVATION: Although many methods are available for the identification of structural domains from protein three-dimensional structures, accurate definition of protein domains and the curation of such data for a large number of proteins are often possible only after manual intervention. The availability of domain definitions for protein structural entries is useful for the sequence analysis of aligned domains, structure comparison, fold recognition procedures and understanding protein folding, domain stability and flexibility. RESULTS: We have improved our method of domain identification starting from the concept of clustering secondary structural elements, but with the intention of reducing the number of discontinuous segments in identified domains. The results of our modified and automatic approach have been compared with the domain definitions from other databases. On a test data set of 55 proteins, this method achieves high agreement (88%) in the number of domains with the crystallographers' definition and resources such as the SCOP, CATH, DALI, 3Dee and PDP databases. This method also obtains a 98% overlap score with the other resources in the definition of domain boundaries of the 55 proteins. We have examined the domain arrangements of 4592 non-redundant protein chains using the improved method to include 5409 domains, leading to an update of the structural domain database. AVAILABILITY: The latest version of the domain database and online domain identification methods are available from http://www.ncbs.res.in/~faculty/mini/ddbase/ddbase.html SUPPLEMENTARY INFORMATION: http://www.ncbs.res.in/~faculty/mini/ddbase/supplementary/supplementary.html

13.
Chromosomes of higher eukaryotes are thought to be organized into a series of discrete and topologically independent higher-order domains. In addition to providing a mechanism for chromatin compaction, these higher-order domains are thought to define independent units of gene activity. Implicit in most models for the folding of the chromatin fiber are special nucleoprotein structures, the domain boundaries, which serve to delimit each higher-order chromosomal domain. We have used an "enhancer-blocking assay" to test putative domain boundaries for boundary function in vivo. This assay is based on the notion that in delimiting independent units of gene activity, domain boundaries should be able to restrict the scope of activity of enhancer elements to genes which reside within the same domain. In this case, interposing a boundary between an enhancer and a promoter should block the action of the enhancer. In the experiments reported here, we have used the yolk protein-1 enhancer element and an hsp70 promoter:lacZ fusion gene to test putative boundary DNA segments for enhancer-blocking activity. We have found that several scs-like elements are capable of blocking the action of the yp-1 enhancer when placed between it and the hsp70 promoter. In contrast, a MAR/SAR DNA segment and another spacer DNA segment had no apparent effect on enhancer activity.

14.
Having multiple domains in proteins can lead to partial folding and increased aggregation. Folding cooperativity, the all-or-nothing folding of a protein, can reduce this aggregation propensity. In agreement with bulk experiments, a coarse-grained structure-based model of the three-domain protein, E. coli Adenylate kinase (AKE), folds cooperatively. Domain interfaces have previously been implicated in the cooperative folding of multi-domain proteins. To understand their role in AKE folding, we computationally create mutants with deleted inter-domain interfaces and simulate their folding. We find that inter-domain interfaces play a minor role in the folding cooperativity of AKE. On further analysis, we find that unlike other multi-domain proteins whose folding has been studied, the domains of AKE are not singly-linked. Two of its domains have two linkers to the third one, i.e., they are inserted into the third one. We use circular permutation to modify AKE chain-connectivity and convert inserted-domains into singly-linked domains. We find that domain insertion in AKE achieves the following: (1) It facilitates folding cooperativity even when domains have different stabilities. Insertion constrains the N- and C-termini of inserted domains and stabilizes their folded states. Therefore, domains that perform conformational transitions can be smaller with fewer stabilizing interactions. (2) Inter-domain interactions are not needed to promote folding cooperativity and can be tuned for function. In AKE, these interactions help promote conformational-dynamics-limited catalysis. Finally, using structural bioinformatics, we suggest that domain insertion may also facilitate the cooperative folding of other multi-domain proteins.

15.
Compactness has been used to locate discontinuous structural units containing one or more polypeptide chains in proteins of known structure. Rather than exhaustively calculating the compactness of all possible units, our procedure uses a screening algorithm to find discontinuous regions that are potentially compact. Precise calculations of compactness are restricted only to units in these regions. With our procedure, compactness can be used to discover discontinuous domains with virtually any number of disjoint peptides. Small, single-domain proteins may contain several compact regions: thus, compact regions do not always correspond to folding domains. Because a domain is an independent folding unit and should contain a hydrophobic core, compact units were further examined for the presence of hydrophobic clusters (Zehfus MH, 1995, Protein Sci 4:1188-1202). This added constraint limits the number of acceptable units and helps greatly in the location of the true structural domains. The larger hydrophobically stabilized compact units correspond to domains, while the smaller units may correspond to folding intermediates.

16.
The amino acid sequence of ERp57, which functions in the endoplasmic reticulum together with the lectins calreticulin and calnexin to achieve folding of newly synthesized glycoproteins, is highly similar to that of protein disulfide isomerase (PDI), but they have their own distinct roles in protein folding. We have characterized the domain structure of ERp57 by limited proteolysis and N-terminal sequencing and have found it to be similar but not identical to that of PDI. ERp57 had three major protease-sensitive regions, the first of which was located between residues 120 and 150, the second between 201 and 215, and the third between 313 and 341, the data thus being consistent with a four-domain structure abb'a'. Recombinant expression in Escherichia coli was used to verify the domain boundaries. Each single domain and a b'a' double domain could be produced in the form of soluble, folded polypeptides, as verified by circular dichroism spectra and urea gradient gel electrophoresis. When the ability of ERp57 and its a and a' domains to fold denatured RNase A was studied by electrospray mass analyses, ERp57 markedly enhanced the folding rate at early time points, although less effectively than PDI, but was an ineffective catalyst of the overall process. The a and a' domains produced only minor, if any, increases in the folding rate at the early stages and no increase at the late stages. Interaction of the soluble ERp57 domains with the P domain of calreticulin was studied by chemical cross-linking in vitro. None of the single ERp57 domains nor the b'a' double domain could be cross-linked to the P domain, whereas cross-linking was obtained with a hybrid ERpabb'PDIa'c polypeptide but not with ERpabPDIb'a'c, indicating that multiple domains are involved in this protein-protein interaction and that the b' domain of ERp57 cannot be replaced by that of PDI.

17.
A phosphoprotein (P) is found in all viruses of the Mononegavirales order. These proteins form homo-oligomers, fulfil similar roles in the replication cycles of the various viruses, but differ in their length and oligomerization state. Sequence alignments reveal no sequence similarity among proteins from viruses belonging to the same family. Sequence analysis and experimental data show that phosphoproteins from viruses of the Paramyxoviridae contain structured domains alternating with intrinsically disordered regions. Here, we used predictions of disorder and of secondary structure, together with an analysis of sequence conservation, to predict the domain organization of the phosphoprotein from Sendai virus, vesicular stomatitis virus (VSV) and rabies virus (RV P). We devised a new procedure for combining the results from multiple prediction methods and locating the boundaries between disordered regions and structured domains. To validate the proposed modular organization predicted for RV P and to confirm that the putative structured domains correspond to autonomous folding units, we used two-hybrid and biochemical approaches to characterize the properties of several fragments of RV P. We found that both central and C-terminal domains can fold in isolation, that the central domain is the oligomerization domain, and that the C-terminal domain binds to nucleocapsids. Our results suggest a conserved organization of P proteins in the Rhabdoviridae family in concatenated functional domains resembling that of the P proteins in the Paramyxoviridae family.

18.
Yan J, Wen W, Xu W, Long JF, Adams ME, Froehner SC, Zhang M. The EMBO Journal, 2005, 24(23): 3985-3995
Pleckstrin homology (PH) domains play diverse roles in cytoskeletal dynamics and signal transduction. Split PH domains represent a unique subclass of PH domains that have been implicated in interactions with complementary partial PH domains 'hidden' in many proteins. Whether partial PH domains exist as independent structural units alone and whether two halves of a split PH domain can fold together to form an intact PH domain are not known. Here, we solved the structure of the PH(N)-PDZ-PH(C) tandem of alpha-syntrophin. The split PH domain of alpha-syntrophin adopts a canonical PH domain fold. The isolated partial PH domains of alpha-syntrophin, although completely unfolded, remain soluble in solution. Mixing of the two isolated domains induces de novo folding and yields a stable PH domain. Our results demonstrate that two complementary partial PH domains are capable of binding to each other to form an intact PH domain. We further showed that the PH(N)-PDZ-PH(C) tandem forms a functionally distinct supramodule, in which the split PH domain and the PDZ domain function synergistically in binding to inositol phospholipids.

19.
Ab initio folding of proteins with all-atom discrete molecular dynamics   Cited by: 3 (self-citations: 0; citations by others: 3)
Discrete molecular dynamics (DMD) is a rapid sampling method used in protein folding and aggregation studies. Until now, DMD was used to perform simulations of simplified protein models in conjunction with structure-based force fields. Here, we develop an all-atom protein model and a transferable force field featuring packing, solvation, and environment-dependent hydrogen bond interactions. We performed folding simulations of six small proteins (20-60 residues) with distinct native structures by the replica exchange method. In all cases, native or near-native states were reached in simulations. For three small proteins, multiple folding transitions are observed, and the computationally characterized thermodynamics are in qualitative agreement with experiments. The predictive power of all-atom DMD highlights the importance of environment-dependent hydrogen bond interactions in modeling protein folding. The developed approach can be used for accurate and rapid sampling of conformational spaces of proteins and protein-protein complexes and applied to protein engineering and design of protein-protein interactions.
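The replica exchange method mentioned in this abstract periodically attempts to swap configurations between simulations running at different temperatures. The abstract does not state which exchange scheme was used, so the sketch below assumes the standard parallel-tempering Metropolis criterion, accepting a swap with probability min(1, exp[(β_i − β_j)(E_i − E_j)]), where β = 1/(k_B T). The function names and the choice of kcal/mol energy units are illustrative assumptions.

```python
import math
import random

# Boltzmann constant in kcal/(mol*K); the energy unit is an assumption.
K_B = 0.0019872041


def swap_probability(e_i, t_i, e_j, t_j, k_b=K_B):
    """Metropolis acceptance probability for exchanging two replicas.

    e_i, e_j: potential energies of the configurations currently held by
              the replicas at temperatures t_i and t_j (kelvin).
    """
    beta_i = 1.0 / (k_b * t_i)
    beta_j = 1.0 / (k_b * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    # Swaps that move the lower-energy state to the colder replica
    # (delta >= 0) are always accepted.
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(e_i, t_i, e_j, t_j, rng=random.random):
    """Return True if the exchange is accepted for one random draw."""
    return rng() < swap_probability(e_i, t_i, e_j, t_j)
```

Passing a custom `rng` makes the acceptance decision reproducible, which is convenient when checking the criterion by hand.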

20.
Large conformational changes in the LID and NMP domains of adenylate kinase (AKE) are known to be key to ligand binding and catalysis, yet the order of binding events and domain motion is not well understood. Combining the multiple available structures for AKE with the energy landscape theory for protein folding, a theoretical model was developed for allostery, order of binding events, and efficient catalysis. Coarse-grained models and nonlinear normal mode analysis were used to infer that intrinsic structural fluctuations dominate LID motion, whereas ligand-protein interactions and cracking (local unfolding) are more important during NMP motion. In addition, LID-NMP domain interactions are indispensable for efficient catalysis. LID domain motion precedes NMP domain motion, during both opening and closing. These findings provide a mechanistic explanation for the observed 1:1:1 correspondence between LID domain closure, NMP domain closure, and substrate turnover. This catalytic cycle has likely evolved to reduce misligation, and thus inhibition, of AKE. The separation of allosteric motion into intrinsic structural fluctuations and ligand-induced contributions can be generalized to further our understanding of allosteric transitions in other proteins.
