Similar literature
Found 20 similar documents (search time: 15 ms)
1.
Recent work has suggested that proteins in early evolution went through a stage of closed loop elements with a typical contour size of 25-35 residues. These closed loops are still the elementary protein units to this day, and can be used to spell out the protein sequence/structure relationship through a relatively small number of protein prototypes. In this study we aimed to identify the sequences that are used to lock the loop ends to one another, and to show how an extensive dictionary of such locking pairs can be created using positional correlation data from a large proteome database and structural data from PDB databases. Such a dictionary can be used in reconstructing the evolutionary pathway that modern proteins have gone through, and in identifying closed loop elements in modern proteins whose 3D structure is still unknown.

2.
Abstract

Recent work has suggested that proteins in early evolution went through a stage of closed loop elements with a typical contour size of 25–35 residues. These closed loops are still the elementary protein units to this day, and can be used to spell out the protein sequence/structure relationship through a relatively small number of protein prototypes. In this study we aimed to identify the sequences that are used to lock the loop ends to one another, and to show how an extensive dictionary of such locking pairs can be created using positional correlation data from a large proteome database and structural data from PDB databases. Such a dictionary can be used in reconstructing the evolutionary pathway that modern proteins have gone through, and in identifying closed loop elements in modern proteins whose 3D structure is still unknown.

3.
It has recently been discovered that globular proteins are universally built from standard loop-n-lock units of about 30 amino acid residues. A hypothesis has been put forward that protein evolution included a loop stage in which these units were autonomous; later they joined together to form longer chains. One would expect that the early individual loop-n-lock elements might still be detected in modern protein sequences as remnants of the hypothetical 30-residue sequence prototypes. Among several strong sequence motifs extracted from the protein sequences of 23 complete bacterial proteomes, one 32-residue prototype is studied here in detail. Numerous sequence segments related to the prototype are identified in the crystal structures of proteins from a PDB_SELECT database. Analysis of the respective chain trajectories for cases with different degrees of sequence conservation confirms that the majority of the segments correspond to closed loops. In the evolutionary diversification of the prototypes, the secondary structure yields first, while the sequence is still moderately conserved. The last feature to go is the chain return property. Apparently, opening the loops would severely destabilize the protein fold, which explains their conservation.

4.
A universal scale of sequence conservation has recently been introduced, based on the omnipresence of protein sequence motifs across species. A large spectrum of short sequences, up to eight residues long, has been found to reside in all or almost all prokaryotic organisms. This discovery introduces a fundamentally new quantitative approach to the problem of reconstructing the last universal common ancestor (LUCA). The most conserved elements (protein modules) with defined structures and sequences harboring the omnipresent motifs are outlined in this work by combining sequence and protein crystal structure data. The structurally conserved modules involve 25–30 amino acid residues and have the appearance of closed loops (loop-n-lock structures). This confirms earlier conclusions on the loop-fold structure of globular proteins. Many of the most conserved modules represent the primary closed loop prototypes that were derived by whole-genome sequence searches. The data presented thus form a basis for further work on the earliest stages of protein evolution. [Reviewing Editor: Dr. Martin Kreitman]

5.
Four basic stages in the evolution of protein structure are described, based on recent work by the authors aimed specifically at reconstructing the earliest events in protein evolution. According to this reconstruction, an initial stage of short peptides of probably only a few amino acid residues was followed by the formation of closed loops of 25-30 residues, a size that corresponds to the polymer-statistically optimal ring closure size for mixed polypeptide chains. The next stage involved fusion of the respective small linear genes and formation of protein structures consisting of several closed loops of nearly standard size, up to 4-6 loops (100-200 amino acid residues) in a typical protein fold. The last, modern stage began with combinatorial fusion of the presumably circular 300-600 bp DNA units and, accordingly, formation of multidomain proteins.

6.
The closed loops within the proteins of the TIM-barrel fold family are analyzed and compared in terms of both sequence and structure. The size distribution of the closed loops of the TIM-barrels confirms a universal preference for the standard size of 25-30 residues. 3D structural RMSD comparisons of the closed loops and presentation of their sequences in binary form suggest that the TIM-barrel proteins are built from descendants of several types of basic closed loop prototypes. Comparison of these prototypes points to a likely common ancestor: alpha-helix-containing closed loops of 28 amino acids. The presumed ancestor is characterized by a specific binary consensus sequence.

7.
Four basic stages in the evolution of protein structure are described, based on recent work by the authors aimed specifically at reconstructing the earliest events in protein evolution. According to this reconstruction, an initial stage of short peptides comprising probably only a few amino acid residues was followed by the formation of closed loops of 25–30 residues, a size that corresponds to the polymer-statistically optimal ring closure size for mixed polypeptide chains. The next stage involved fusion of relatively small linear genes and formation of protein structures consisting of several closed loops of a nearly standard size, with 4–6 loops (100–200 amino acid residues) in a typical protein fold. The last, modern stage began with combinatorial fusion of the presumably circular 300–600 bp DNA units and, accordingly, formation of multidomain proteins.

8.
Analysis of crystallized protein structures suggests that globular proteins are organized as consecutively connected units of 25-35 residues. These units are closed loops, that is, returns of the polypeptide chain trajectory to close contact with itself. This universal feature, apparently polymer-statistical in nature, is the basis for a fundamentally new view of globular proteins as loop fold structures. The same unit size has been detected by positional autocorrelation analysis in protein sequences translated from complete prokaryotic genomes, which strongly indicates an evolutionary connection between the units. The units are further characterized by prototype sequences that match their numerous derivatives in the translated genomes. Matches to the five strongest prokaryotic prototypes and to three prototypes of C. elegans are identified in the sequences of crystallized proteins, and their structures are analyzed. In the majority of cases the corresponding segments of the polypeptide chains form closed loops, though the evolutionary fate of each prototype element is shown to be rather diverse. The loop ends can then be separated by sequence-wise distant segments and stabilized by spatial interactions in the context of the overall globular structure. The units belong to a presumably limited spectrum of sequence prototypes, the full repertoire of which would constitute a proteomic code.
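The definition of a closed loop given in this abstract (a chain segment of 25-35 residues whose ends return to close contact) can be illustrated with a minimal sketch. The 10 Å Cα-Cα contact cutoff, the function name `find_closed_loops`, and the input format (a list of Cα coordinates) are assumptions made for illustration; the abstract does not state the contact criterion used.

```python
import math

def find_closed_loops(ca_coords, min_len=25, max_len=35, contact_cutoff=10.0):
    """Return (start, end, distance) for every segment of min_len..max_len
    residues whose first and last C-alpha atoms lie within contact_cutoff.

    ca_coords: list of (x, y, z) tuples in angstroms, one per residue.
    contact_cutoff: assumed contact threshold, not taken from the source.
    """
    loops = []
    n = len(ca_coords)
    for start in range(n):
        for length in range(min_len, max_len + 1):
            end = start + length - 1  # index of the last residue in the segment
            if end >= n:
                break
            d = math.dist(ca_coords[start], ca_coords[end])
            if d <= contact_cutoff:
                loops.append((start, end, d))
    return loops
```

This sketch reports every qualifying segment, including overlapping ones; choosing a non-overlapping, consecutive set of loops would need an additional selection step that is not shown here.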

9.
Abstract

The closed loops within the proteins of the TIM-barrel fold family are analyzed and compared in terms of both sequence and structure. The size distribution of the closed loops of the TIM-barrels confirms a universal preference for the standard size of 25–30 residues. 3D structural RMSD comparisons of the closed loops and presentation of their sequences in binary form suggest that the TIM-barrel proteins are built from descendants of several types of basic closed loop prototypes. Comparison of these prototypes points to a likely common ancestor: alpha-helix-containing closed loops of 28 amino acids. The presumed ancestor is characterized by a specific binary consensus sequence.

10.
Li W, Liu Z, Lai L. Biopolymers 1999, 49(6): 481-495
A general problem in comparative modeling and protein design is the conformational evaluation of loops with a given sequence in specific protein frameworks. Loops of different sequences and structures on similar scaffolds are common in the Protein Data Bank (PDB). In order to explore both their structural and sequence diversity, a database of loops connecting similar secondary structure fragments is constructed by searching the database of families of structurally similar proteins and the PDB. A total of 84 loop families having 2-13 residues are found among the well-determined structures with resolution better than 2.5 Å. Eight alpha-alpha, 20 alpha-beta, 19 beta-alpha, and 37 beta-beta families are identified. Every family contains more than 5 loop motifs. In each family, no two loops share the same sequence and all the frameworks superimpose well. Forty-three new loop classes are distinguished in the database. The structural variability of loops in homologous proteins is examined and shown in 44 families. Motif families are characterized with geometric parameters and sequence patterns. The conformations of loops in each family are clustered into subfamilies using the average linkage cluster analysis method. Information such as geometric properties, sequence profile, sequence and structural variability in the loop, structural alignment parameters, sequence similarities, and clustering results is provided. Correlations between the conformation of loops and the loop sequence, motif sequence, and global sequence of the PDB chain are examined in order to find how loop structures depend on their sequences and how they are affected by the local and global environment. Strong correlations (R > 0.75) are found in only 24 families. The best R value is 0.98. The database is available through the Internet.

11.
Recent sequence analysis of complete prokaryotic proteomes suggests that in early evolutionary stages proteins were rather small, 25-35 amino acids in size. Corroborating evidence comes from protein crystal data, which indicate this size for closed loops, the universal structural units of globular proteins. In the latest development we were able to derive and structurally characterize several sequence/structure prototypes apparently representing early protein units. Structurally, the prototypes appear as closed loops stabilized by end-to-end van der Waals interactions. While nearly standard in size, the loops are highly diverse in terms of their secondary structure. A presentation of a protein as an assembly of descendants of the prototypes, the first of its kind, is described in detail here. The sequence and structure of the ATP-binding subunit of histidine permease of S. typhimurium are shown to contain several modified copies of different prototype elements (closed loops) and thus can be spelled as x-PI-x-PIV-PVI-PII-PVII-x, where PI-PVII are the prototype elements. This study sets out the basic principles for the sequence/structure prototype spelling of globular proteins.

12.
Standard building blocks of proteins, closed loops of 25-30 amino acid residues, have recently been discovered and further characterized by the combined efforts of several laboratories. These studies introduce challenging new views on protein structure, folding, and evolution. In particular, the role of van der Waals contacts in protein stability is better understood: they can be considered as locks closing the polypeptide chain returns and forming the loop-n-lock elements. The linearity of the arrangement of the standard loops in proteins has important evolutionary implications. Selection pressure to maintain loops of nearly standard size is reflected in protein sequences as a characteristic distance between hydrophobic residues, equal to the loop end-to-end distance. Further characterization of the loop-n-lock units reveals several sequence/structure prototypes, which suggests a new basis for protein classification. The following is a review of these studies.

13.
14.
Phosphotyrosine hydrolysis by protein tyrosine phosphatases (PTPs) involves substrate binding by the PTP loop and closure over the active site by the WPD loop. The E loop, located immediately adjacent to the PTP and WPD loops, is conserved among human PTPs in both sequence and structure, yet the role of this loop in substrate binding and catalysis is comparatively unexplored. Hematopoietic PTP (HePTP) is a member of the kinase interaction motif (KIM) PTP family. Compared to other PTPs, KIM-PTPs have E loops that are unique in both sequence and structure. In order to understand the role of the E loop in the transition between the closed state and the open state of HePTP, we identified a novel crystal form of HePTP that allowed the closed-state-to-open-state transition to be observed within a single crystal form. These structures, which include the first structure of the HePTP open state, show that the WPD loop adopts an ‘atypically open’ conformation and, importantly, that ligands can be exchanged at the active site, which is critical for HePTP inhibitor development. These structures also show that tetrahedral oxyanions bind at a novel secondary site and function to coordinate the PTP, WPD, and E loops. Finally, using both structural and kinetic data, we reveal a novel role for E-loop residue Lys182 in enhancing HePTP catalytic activity through its interaction with Asp236 of the WPD loop, providing the first evidence for the coordinated dynamics of the WPD and E loops in the catalytic cycle, which, as we show, is relevant to multiple PTP families.

15.
The capsid proteins of the ADV-G isolate of Aleutian mink disease parvovirus (ADV) were expressed in 10 nonoverlapping segments as fusions with maltose-binding protein in pMAL-C2 (pVP1, pVP2a through pVP2i). The constructs were designed to capture the VP1 unique sequence and the portions analogous to the four variable surface loops of canine parvovirus (CPV) in individual fragments (pVP2b, pVP2d, pVP2e, and pVP2g, respectively). The panel of fusion proteins was immunoblotted with sera from mink infected with ADV. Seropositive mink infected with either ADV-TR, ADV-Utah, or ADV-Pullman reacted preferentially against certain segments, regardless of mink genotype or virus inoculum. The most consistently immunoreactive regions were pVP2g, pVP2e, and pVP2f, the segments that encompassed the analogs of CPV surface loops 3 and 4. The VP1 unique region was also consistently immunoreactive. These findings indicated that infected mink recognize linear epitopes that localized to certain regions of the capsid protein sequence. The segment containing the hypervariable region (pVP2d), corresponding to CPV loop 2, was also expressed from ADV-Utah. An anti-ADV-G monoclonal antibody and a rabbit anti-ADV-G capsid antibody reacted exclusively with the ADV-G pVP2d segment but not with the corresponding segment from ADV-Utah. Mink infected with ADV-TR or ADV-Utah also preferentially reacted with the pVP2d sequence characteristic of that virus. These results suggested that the loop 2 region may contain a type-specific linear epitope and that the epitope may also be specifically recognized by infected mink. Heterologous antisera were prepared against the VP1 unique region and the four segments capturing the variable surface loops of CPV. The antisera against the proteins containing loop 3 or loop 4, as well as the anticapsid antibody, neutralized ADV-G infectivity in vitro and bound to capsids in immune electron microscopy. 
These results suggested that regions of the ADV capsid proteins corresponding to surface loops 3 and 4 of CPV contain linear epitopes that are located on the external surface of the ADV capsid. Furthermore, these linear epitopes contain neutralizing determinants. Computer comparisons with the CPV crystal structure suggest that these sequences may be adjacent to the threefold axis of symmetry of the viral particle.

16.
Methods for the prediction of protein function from structure are of growing importance in the age of structural genomics. Here, we focus on the problem of identifying sites of potential serine protease inhibitor interactions on the surface of proteins of known structure. Given that there is no sequence conservation within canonical loops from different inhibitor families, we first compare representative loops to all fragments of equal length among proteins of known structure by calculating main-chain RMS deviation. Fragments with RMS deviation below a certain threshold (hits) are removed if residues have solvent accessibilities appreciably lower than those observed in the search structure. These remaining hits are further filtered to remove those occurring largely within secondary structure elements. Likely functional significance is restricted further by considering only extracellular protein domains. By comparing different canonical loop structures to the protein structure database, we show that the method is able to detect previously known inhibitors. In addition, we discuss potentially new canonical loop structures found in secreted hydrolases, toxins, viral proteins, cytokines and other proteins. We discuss the possible functional significance of several of the examples found, and comment on implications for the prediction of function from protein 3D structure.
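The first step described in this abstract, comparing a query loop against every equal-length fragment of a target chain and keeping those below a deviation threshold, can be sketched as a sliding-window scan. To keep the example free of a superposition step, this sketch swaps in a distance-based RMSD (dRMSD, the RMS difference between corresponding intra-fragment distances) for the main-chain RMS deviation named in the abstract. The function names, the 1.0 Å threshold, and the use of one coordinate per residue are assumptions for illustration only.

```python
import math
from itertools import combinations

def drmsd(frag_a, frag_b):
    """Distance RMSD between two equal-length fragments (each with >= 2 points).

    Invariant to rotation and translation, so no superposition is needed.
    Unlike a superposition RMSD, it cannot tell mirror images apart.
    """
    pairs = list(combinations(range(len(frag_a)), 2))
    total = sum(
        (math.dist(frag_a[i], frag_a[j]) - math.dist(frag_b[i], frag_b[j])) ** 2
        for i, j in pairs
    )
    return math.sqrt(total / len(pairs))

def scan_for_hits(query, target, threshold=1.0):
    """Slide the query along the target; return (start, score) for every
    window whose dRMSD to the query is at or below the assumed threshold."""
    k = len(query)
    hits = []
    for start in range(len(target) - k + 1):
        score = drmsd(query, target[start:start + k])
        if score <= threshold:
            hits.append((start, score))
    return hits
```

The later filters mentioned in the abstract (solvent accessibility, secondary structure content, extracellular domains) would be applied to the returned hits and are not modeled here.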

17.
Fibronectin type III (FN-III) domains are autonomously folded modules found in a variety of multidomain proteins. The 10th FN-III domain from fibronectin (fnFN10) and the 3rd FN-III domain from tenascin-C (tnFN3) have 27% sequence identity and the same overall fold; however, the CC' loop has a different pattern of backbone hydrogen bonds and the FG loop is longer in fnFN10 compared to tnFN3. To examine the influence of length, sequence, and context in determining dynamical properties of loops, CC' and FG loops were swapped between fnFN10 and tnFN3 to generate four mutant proteins and backbone conformational dynamics on ps-ns and μs-ms timescales were characterized by solution ¹⁵N-NMR spin relaxation spectroscopy. The grafted loops do not strongly perturb the properties of the protein scaffold; however, specific effects of the mutations are observed for amino acids that are proximal in space to the sites of mutation. The amino acid sequence primarily dictates conformational dynamics when the wild-type and grafted loop have the same length, but both sequence and context contribute to conformational dynamics when the loop lengths differ. The results suggest that changes in conformational dynamics of mutant proteins must be considered in both theoretical studies and protein design efforts.

18.
Protein structure can be viewed as a compact linear array of closed loops of nearly standard size, 25-30 amino acid residues (Berezovsky et al., FEBS Letters 2000; 466: 283-286), irrespective of the details of secondary structure. The end-to-end contacts in the loops are likely to be hydrophobic, which is a testable hypothesis. This notion can be verified by direct comparison of the loop maps with Kyte and Doolittle hydropathicity plots. This analysis reveals that most of the loop ends are indeed hydrophobic. The same conclusion is reached on the basis of positional autocorrelation analysis of the protein sequences of 23 fully sequenced bacterial genomes. The hydrophobic residues valine, alanine, glycine, leucine, and isoleucine appear preferentially at a distance of 25-30 residues from one another. These observations open a new perspective on protein structure and folding: a consecutive looping of the polypeptide chain, with the loops ending primarily at hydrophobic nuclei.
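The positional autocorrelation analysis mentioned in this abstract can be illustrated with a minimal sketch: count how often two hydrophobic residues occur a given number of positions apart, and compare that with what a random arrangement of the same composition would give. The residue set (V, A, G, L, I) is taken from the abstract; the function name, the `max_lag` parameter, and the observed/expected normalization are assumptions, and the original study may have normalized differently.

```python
from collections import Counter

# Residues the abstract names as preferring the 25-30 residue spacing.
HYDROPHOBIC = set("VAGLI")

def hydrophobic_autocorrelation(sequences, max_lag=50):
    """Return {lag: observed/expected} for hydrophobic-hydrophobic pairs.

    A ratio above 1 at some lag means hydrophobic residues occur that far
    apart more often than expected from composition alone.
    """
    pair_counts = Counter()  # hydrophobic pairs observed at each lag
    slot_counts = Counter()  # all residue pairs available at each lag
    n_hydrophobic = 0
    n_total = 0
    for seq in sequences:
        flags = [res in HYDROPHOBIC for res in seq]
        n_hydrophobic += sum(flags)
        n_total += len(flags)
        for lag in range(1, max_lag + 1):
            for i in range(len(flags) - lag):
                slot_counts[lag] += 1
                if flags[i] and flags[i + lag]:
                    pair_counts[lag] += 1
    if n_total == 0:
        return {}
    p = n_hydrophobic / n_total
    expected = p * p  # chance that both positions are hydrophobic
    return {
        lag: pair_counts[lag] / (slot_counts[lag] * expected)
        for lag in range(1, max_lag + 1)
        if slot_counts[lag] and expected > 0
    }
```

Applied to a large set of proteome sequences, a profile like this would be inspected for elevated ratios around lags 25-30, which is the signal the abstract reports.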

19.
Crasto CJ, Feng J. Proteins 2001, 42(3): 399-413
We performed an extensive sequence analysis on the loops of proteins. By dividing a loop databank derived from the Protein Data Bank into groups, we analyzed the chemical characteristics and the sequence preferences of loops of different lengths and loops connecting different secondary structures in proteins. We found that a large population of loops in our loop databank (94.4%) is either partially or completely surface-exposed. A majority of surface loops in proteins are hydrophilic, whereas the chemical characteristics of interior loops are relatively neutral according to Eisenberg's consensus hydrophobicity scale. As a first step in investigating the intrinsic sequence-structure relationship of loop sequences in proteins, we performed a neighbor-dependent sequence analysis that calculated the effect of the neighboring amino acid type on the loop propensity of residues in loops. This method enhances the statistical significance of residue propensity, thus allowing us to explore the positional preference of amino acids in loops. Our analysis yielded a series of amino acid dyads that showed high preference for loop conformation. The data presented in this study should prove useful for developing potential codes in recognizing loop sequences in proteins.

20.
St-Pierre JF, Mousseau N. Proteins 2012, 80(7): 1883-1894
We present an adaptation of the ART-nouveau energy surface sampling method to the problem of loop structure prediction. This method, previously used to study protein folding pathways and peptide aggregation, is well suited to the problem of sampling the conformation space of large loops because it targets probable folding pathways instead of sampling that space exhaustively. The number of sampled conformations needed by ART nouveau to find the global energy minimum for a loop was found to scale linearly with the sequence length of the loop for loops between 8 and about 20 amino acids. Given that the computational cost of sampling each new conformation also scales linearly with the loop sequence length, we estimate that the total computational cost of sampling larger loops scales quadratically, compared to the exponential scaling of exhaustive search methods.
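The scaling estimate in this abstract follows from multiplying its two linear dependences. As a short worked relation, with L denoting the loop length and the symbols N_conf, c_conf and C_total introduced here purely for illustration (they do not appear in the source):

```latex
N_{\text{conf}}(L) \propto L, \qquad c_{\text{conf}}(L) \propto L
\quad\Longrightarrow\quad
C_{\text{total}}(L) = N_{\text{conf}}(L)\, c_{\text{conf}}(L) \propto L^{2}
```

Here N_conf is the number of conformations sampled before the global minimum is found and c_conf is the cost of generating one conformation. The linear behavior of N_conf is reported only for loops of 8 to about 20 residues, so the quadratic total for larger loops is an extrapolation, as the abstract itself indicates.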
