Similar literature
Found 20 similar documents; search took 31 ms.
1.
The modular structures of extant proteins and genes suggest that modern genes developed hierarchically from combinatorial assemblages of smaller primordial genetic units (microgenes). The MolCraft system described in this review is a new type of in vitro protein evolution system whose underlying concept is the hierarchical evolution of genes. In MolCraft, a microgene is initially evolved in silico and then tandemly polymerized with insertion or deletion mutations at the junctions between microgene units. Because of the junctional perturbations, proteins translated from a single microgene polymer are molecularly diverse, originating from the combinatorics of three reading frames, and are thus combinatorial polymers of three peptides. Notably, repetitiousness retained in the overall structure of proteins contributes to the formation of ordered structures, and enhances the chances of reconstituting biological activity rationally encrypted in the microgene unit. Applications of this new technology are discussed.
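The junction mechanism in the abstract above can be illustrated with a minimal Python sketch. It is not the MolCraft implementation: the function names, the example sequences, and the model of exactly one inserted or deleted base per junction are assumptions made for illustration. The sketch shows how a frameshift at each junction makes the downstream microgene copy readable in a different one of three frames.

```python
import random

# Standard genetic code, bases ordered T, C, A, G (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def frame_peptides(microgene):
    """Peptides encoded by one microgene unit in reading frames 0, 1, and 2."""
    return [translate(microgene[frame:]) for frame in range(3)]


def polymerize(microgene, copies, rng):
    """Join copies of the microgene, perturbing each junction by -1, 0, or +1 base.

    Assumed toy model: a deletion removes the last base before the junction,
    an insertion adds one random base at the junction.
    """
    polymer = microgene
    for _ in range(copies - 1):
        shift = rng.choice((-1, 0, 1))
        if shift == -1:
            polymer = polymer[:-1]
        elif shift == 1:
            polymer += rng.choice(BASES)
        polymer += microgene
    return polymer


if __name__ == "__main__":
    rng = random.Random(1)
    unit = "GCTGGTAACGGCGATCCA"  # hypothetical microgene, for illustration only
    for _ in range(3):
        print(translate(polymerize(unit, 5, rng)))
```

Running the demo several times with different seeds gives different proteins from the same microgene, which is the source of diversity the abstract describes. The sketch does not model the in silico design step, which would have to ensure that the unit has no stop codons in the frames of interest.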

2.
We created artificial proteins that contained repeats of a short peptide motif, Asn-Gly-Asx. In nature this motif is repeated within shell proteins as an idiosyncratic domain, while in vitro it has been shown to suppress calcification. The motif was embedded within peptide sequences that did or did not have the ability to form secondary structures, which provided the motif with a variety of physicochemical properties. Although a short synthetic peptide containing the motif did not inhibit calcification in vitro, some of the artificial proteins carrying repeats of the motif did show robust suppression of calcification. Artificial proteins lacking the motif did not exhibit suppressive activity. Likewise, one construct containing multiple repeats of the motif also did not exert an inhibitory effect on calcification. Apparently, carrying the Asn-Gly-Asx motif is not, by itself, sufficient for expression of its cryptic activity; instead, certain physicochemical properties of the polypeptides mediate its manifestation. We anticipate that syntheses using "motif programming", such as the one described here, will shed light on the origin of repetitive sequences as well as on the evolution of biomineralization proteins.

3.
Internal repetition within proteins has been a successful stratagem on multiple separate occasions throughout evolution. Such protein repeats possess regular secondary structures and form multirepeat assemblies in three dimensions of diverse sizes and functions. In general, however, internal repetition affords a protein enhanced evolutionary prospects due to an enlargement of its available binding surface area. Constraints on sequence conservation appear to be relatively lax, due to binding functions ensuing from multiple, rather than single, repeats. Considerable sequence divergence as well as the short lengths of sequence repeats mean that repeat detection can be a particularly arduous task. We also consider the conundrum of how multiple repeats, which show strong structural and functional interdependencies, ever evolved from a single repeat ancestor. In this review, we illustrate each of these points by referring to six prolific repeat types (repeats in beta-propellers and beta-trefoils and tetratricopeptide, ankyrin, armadillo/HEAT, and leucine-rich repeats) and to other less-prolific but nonetheless interesting repeats.

4.
The armadillo domain is a right‐handed super‐helix of repeating units composed of three α‐helices each. Armadillo repeat proteins (ArmRPs) are frequently involved in protein–protein interactions, and because of their modular recognition of extended peptide regions they can serve as templates for the design of artificial peptide binding scaffolds. On the basis of sequential and structural analyses, different consensus‐designed ArmRPs were synthesized and showed high thermodynamic stabilities compared to naturally occurring ArmRPs. We determined the crystal structures of four full‐consensus ArmRPs with three or four identical internal repeats and two different designs for the N‐ and C‐caps. The crystal structures of these designs were refined at resolutions ranging from 1.80 to 2.50 Å. A redesign of our initial caps was required to obtain well diffracting crystals. However, the structures with the redesigned caps showed domain swapping events between the N‐caps. To prevent this domain swap, 9 and 6 point mutations were introduced in the N‐ and C‐caps, respectively. Structural and biophysical analysis showed that this subsequent redesign of the N‐cap prevented domain swapping and improved the thermodynamic stability of the proteins. We systematically investigated the best cap combinations. We conclude that designed ArmRPs with optimized caps are intrinsically stable and well‐expressed monomeric proteins and that the high‐resolution structures provide excellent structural templates for the continuation of the design of sequence‐specific modular peptide recognition units based on armadillo repeats.

5.
The biologically active state of many proteins requires their prior homo-oligomerisation. Such complexes are typically symmetrical, a feature that has been proposed to increase their stability and facilitate the evolution of allosteric regulation. We wished to examine the possibility that similar structures and properties could arise from genetic amplifications leading to internal symmetrical repeats. For this, we identified internal structural repeats in a nonredundant Protein Data Bank subset. While testing if repeats in proteins tend to be symmetrical, we found that about half of the large internal repeats are symmetrical, most frequently around a rotation axis of 180°. These repeats were most likely created by genetic amplification processes because they show significant sequence similarity. Symmetrical repeats tend to have a fixed number of copies corresponding to their rotational symmetry order, that is, two for 180° rotation axis, whereas asymmetrical repeats are in longer proteins and show copy number variability. When possible, we confirmed that proteins with symmetrical repeats folding as an n-mer have homologues lacking the repeat with a higher oligomerisation number corresponding to the rotation symmetry order of the repeat. Phylogenetic analyses of these protein families suggest that typically, but not always, symmetrical repeats arise in one single event from proteins that are homo-oligomers. These results suggest that oligomerisation and amplification of internal sequences can interplay in evolutionary terms because they result in functional analogues when the latter exhibit rotational symmetry.

6.
Repeat proteins have a modular organization and a regular architecture that make them attractive models for design and directed evolution experiments. HEAT repeat proteins, although very common, have not been used as a scaffold for artificial proteins, probably because they are made of long and irregular repeats. Here, we present and validate a consensus sequence for artificial HEAT repeat proteins. The sequence was defined from the structure-based sequence analysis of a thermostable HEAT-like repeat protein. Appropriate sequences were identified for the N- and C-caps. A library of genes coding for artificial proteins based on this sequence design, named αRep, was assembled using new and versatile methodology based on circular amplification. Proteins picked randomly from this library are expressed as soluble proteins. The biophysical properties of proteins with different numbers of repeats and different combinations of side chains in hypervariable positions were characterized. Circular dichroism and differential scanning calorimetry experiments showed that all these proteins are folded cooperatively and are very stable (Tm > 70 °C). Stability of these proteins increases with the number of repeats. Detailed gel filtration and small-angle X-ray scattering studies showed that the purified proteins form either monomers or dimers. The X-ray structure of a stable dimeric variant was solved. The protein is folded with a highly regular topology and the repeat structure is organized, as expected, as pairs of alpha helices. In this protein variant, the dimerization interface results directly from the variable surface enriched in aromatic residues located in the randomized positions of the repeats. The dimer was crystallized both in an apo and in a PEG-bound form, revealing a very well defined binding crevice and some structural flexibility at the interface. This fortuitous binding site could later prove to be a useful binding site for other low molecular mass partners.

7.
There are several different families of repeat proteins. In each, a distinct structural motif is repeated in tandem to generate an elongated structure. The nonglobular, extended structures that result are particularly well suited to present a large surface area and to function as interaction domains. Many repeat proteins have been demonstrated experimentally to fold and function as independent domains. In tetratricopeptide (TPR) repeats, the repeat unit is a helix-turn-helix motif. The majority of TPR motifs occur as three to over 12 tandem repeats in different proteins. The majority of TPR structures in the Protein Data Bank are of isolated domains. Here we present the high-resolution structure of NlpI, the first structure of a complete TPR-containing protein. We show that in this instance the TPR motifs do not fold and function as an independent domain, but are fully integrated into the three-dimensional structure of a globular protein. The NlpI structure is also the first TPR structure from a prokaryote. It is of particular interest because it is a membrane-associated protein, and mutations in it alter septation and virulence.

8.
Many proteins, especially in eukaryotes, contain tandem repeats of several domains from the same family. These repeats have a variety of binding properties and are involved in protein–protein interactions as well as binding to other ligands such as DNA and RNA. The rapid expansion of protein domain repeats is assumed to have evolved through internal tandem duplications. However, the exact mechanisms behind these tandem duplications are not well-understood. Here, we have studied the evolution, function, protein structure, gene structure, and phylogenetic distribution of domain repeats. For this purpose we have assigned Pfam-A domain families to 24 proteomes with more sensitive domain assignments in the repeat regions. These assignments confirmed previous findings that eukaryotes, and in particular vertebrates, contain a much higher fraction of proteins with repeats compared with prokaryotes. The internal sequence similarity in each protein revealed that the domain repeats are often expanded through duplications of several domains at a time, while the duplication of one domain is less common. Many of the repeats appear to have been duplicated in the middle of the repeat region. This is in strong contrast to the evolution of other proteins that mainly works through additions of single domains at either terminus. Further, we found that some domain families show distinct duplication patterns, e.g., nebulin domains have mainly been expanded with a unit of seven domains at a time, while duplications of other domain families involve varying numbers of domains. Finally, no common mechanism for the expansion of all repeats could be detected. We found that the duplication patterns show no dependence on the size of the domains. Further, repeat expansion in some families can possibly be explained by shuffling of exons. However, exon shuffling could not have created all repeats.

9.
A census of protein repeats.   Total citations: 20; self-citations: 0; citations by others: 20
In this study, we analyzed all known protein sequences for repeating amino acid segments. Although duplicated sequence segments occur in 14% of all proteins, eukaryotic proteins are three times more likely to have internal repeats than prokaryotic proteins. After clustering the repetitive sequence segments into families, we find repeats from eukaryotic proteins have little similarity with prokaryotic repeats, suggesting most repeats arose after the prokaryotic and eukaryotic lineages diverged. Consequently, protein classes with the highest incidence of repetitive sequences perform functions unique to eukaryotes. The frequency distribution of the repeating units shows only weak length dependence, implicating recombination rather than duplex melting or DNA hairpin formation as the limiting mechanism underlying repeat formation. The mechanism favors additional repeats once an initial duplication has been incorporated. Finally, we show that repetitive sequences are favored that contain small and relatively water-soluble residues. We propose that error-prone repeat expansion allows repetitive proteins to evolve more quickly than non-repeat-containing proteins.

10.
By controlling the growth of inorganic crystals, macro-biomolecules, including proteins, play pivotal roles in modulating biomineralization. Natural proteins that promote biomineralization are often composed of simple repeats of peptide sequences; however, the relationship between these repetitive structures and their functions remains largely unknown. Here we show that an artificial protein containing a repeated peptide sequence allows NaCl, KCl, CuSO4 and sucrose to form a variety of macroscopic structures, as represented by their dendritic configurations. Mutational analyses revealed that the physicochemical characteristics of the protein, not the peptide sequence per se, were responsible for formation of the dendritic structures. This suggests that proteins that modulate crystal growth may have evolved as repeat-containing forms at a relatively high rate. These observations could serve as the basis for developing new genetic programming systems for creation of artificial proteins able to modulate crystal growth from inorganic compounds, and may thus provide a new tool for nano-biotechnology.

11.
The Echinococcus granulosus actin filament-fragmenting protein (EgAFFP) is a three-domain member of the gelsolin family of proteins, which is antigenic to human hosts. These proteins, formed by three or six conserved domains, are involved in the dynamic rearrangements of the cytoskeleton, being responsible for severing and capping actin filaments and promoting nucleation of actin monomers. Various structures of six-domain gelsolin-related proteins have been investigated, but little information on the structure of three-domain members is available. In this work, the solution structure of the three-domain EgAFFP has been investigated through small-angle X-ray scattering (SAXS) studies. EgAFFP exhibits an elongated molecular shape. The radius of gyration and the maximum dimension obtained by SAXS were, respectively, 2.52 ± 0.01 nm and 8.00 ± 1.00 nm, both in the absence and presence of Ca2+. Two different molecular homology models were built for EgAFFP, but only one was validated through SAXS studies. The predicted structure for EgAFFP consists of three repeats of a central beta-sheet sandwiched between one short and one long alpha-helix. Possible implications of the structure of EgAFFP upon actin binding are discussed.

12.
Fibrous proteins found in natural materials such as silk fibroins, spider silks, and viral spikes increasingly serve as a source of inspiration for the design of novel, artificial fibrous materials. The fiber protein from the adenovirus has previously served as a model for the design of artificial, self-assembling fibers. The fibrous shaft of this protein consists of 15-amino-acid sequence repeats that fold into a triple β-spiral motif in their native context. Recombinant proteins based on multimers of simplified consensus shaft repeats were previously reported to form self-assembling fibrils from which filaments could be spun. Here, we describe the structural characterization of these fibrils; X-ray fiber diffraction, Raman spectroscopy, and Congo Red binding strongly suggest an amyloid-type structure for these fibrils, with β-strands arranged perpendicular to the fibril axis. This amyloid structure is distinct from the native β-spiral fold, and similar to amyloid structures formed by short, synthetic peptides corresponding to shaft sequences. We discuss implications for the rational design of novel fibrous materials, based on crystal structure information and knowledge of folding and assembly pathways of natural fibrous proteins.

13.
Internal repeats in protein sequences have wide-ranging implications for the structure and function of proteins. A careful analysis of the repeats in protein sequences may help us to better understand the structural organization of proteins and their evolutionary relations. In this paper, a mathematical method for searching for latent periodicity in protein sequences is developed. Using this method, we identified simple sequence repeats in the alkaline proteases and found that the sequences could show the same periodicity as their tertiary structures. This result may help us to reduce difficulties in the study of the relationship between sequences and their structures.
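The abstract above does not say which mathematical method it uses, so the following Python sketch is only a stand-in that illustrates the general idea of a periodicity search. It scores each candidate period p by the fraction of positions whose residue matches the residue p positions later. The function names and the scoring rule are assumptions, not the paper's method.

```python
def period_scores(seq, max_period):
    """Map each period p in 1..max_period to the fraction of positions i
    for which seq[i] == seq[i + p]."""
    scores = {}
    for p in range(1, max_period + 1):
        pairs = len(seq) - p
        if pairs <= 0:
            break  # sequence too short to test this period
        matches = sum(seq[i] == seq[i + p] for i in range(pairs))
        scores[p] = matches / pairs
    return scores


def best_period(seq, max_period):
    """Return the smallest period with the highest self-match score."""
    scores = period_scores(seq, max_period)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    toy = "NGD" * 10  # hypothetical perfectly periodic sequence
    print(best_period(toy, 5))
```

This naive score has known weaknesses: multiples of the true period score equally well, identity matching ignores conservative substitutions and insertions or deletions, and no statistical significance is estimated. A method aimed at latent (diverged) periodicity would need to address all three.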

14.
Creating artificial protein families affords new opportunities to explore the determinants of structure and biological function free from many of the constraints of natural selection. We have created an artificial family comprising ~3,000 P450 heme proteins that correctly fold and incorporate a heme cofactor by recombining three cytochromes P450 at seven crossover locations chosen to minimize structural disruption. Members of this protein family differ from any known sequence by an average of 72 and by as many as 109 amino acids. Most (>73%) of the properly folded chimeric P450 heme proteins are catalytically active peroxygenases; some are more thermostable than the parent proteins. A multiple sequence alignment of 955 chimeras, including both folded and not, is a valuable resource for sequence-structure-function studies. Logistic regression analysis of the multiple sequence alignment identifies key structural contributions to cytochrome P450 heme incorporation and peroxygenase activity and suggests possible structural differences between parents CYP102A1 and CYP102A2.
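The size of the recombination space in the abstract above follows from simple counting, sketched here in Python. The sketch assumes that seven crossover locations divide each parent into eight contiguous blocks and that every block may come from any of the three parents, which gives 3^8 = 6,561 possible sequences, parents included. The function names and toy block strings are hypothetical.

```python
from itertools import product


def chimera_codes(n_parents, n_crossovers):
    """Iterate over every block-wise parent assignment, e.g. (0, 2, 1, ...).

    n_crossovers crossover points are assumed to define n_crossovers + 1 blocks.
    """
    n_blocks = n_crossovers + 1
    return product(range(n_parents), repeat=n_blocks)


def build_chimera(blocks_by_parent, code):
    """Concatenate block i taken from parent code[i]."""
    return "".join(blocks_by_parent[parent][i] for i, parent in enumerate(code))


if __name__ == "__main__":
    # Three parents, seven crossovers: 3 ** 8 block combinations.
    print(sum(1 for _ in chimera_codes(3, 7)))
```

Under this counting, the roughly 3,000 folded proteins reported in the abstract would be a subset of the full combinatorial space rather than all of it.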

15.
Stevens TJ, Paoli M. Proteins, 2008, 70(2): 378-387
The beta-propeller fold is a phylogenetically widespread, common protein architecture able to support a range of different functions such as catalysis, ligand binding and transport, regulation and protein binding. Interestingly, it appears that the beta-propeller topology is also compatible with strikingly diverse sequences. Amongst this diversity, there are three large groups of proteins with related sequences and very important cellular and intercellular regulatory functions: WD, kelch, and YWTD proteins. A common characteristic between these protein families is that their sequences, while distinct, all contain internal repeats 40-45 residues long. Through a pangenomic analysis using internal repeat profiles derived from the structurally known propeller modules of the eukaryotic protein RCC1 and the related prokaryotic protein BLIP-II, we have defined a new superfamily of propeller repeats, the RCC1-like repeats (RLRs). These sequences turn out to be more phylogenetically widespread than other large groups of propeller proteins, occurring in both prokaryotic and eukaryotic genomes. Interestingly, our research showed that RLR domains with different numbers of repeats exist, ranging from 3 to 7, and possibly more. A novel, intriguing finding is the discovery of sequences with 3 repeats, as well as proteins with 10 modular units, though in the latter case it is not clear whether these are made of two 5-bladed domains or a single, novel 10-bladed propeller. In addition, the results indicate that circular permutation events may have taken place in the evolution of these proteins. It is now established that the group of RLR proteins is extremely numerous and is characterized by unique, remarkable features which place it in a position of special interest as an important superfamily of proteins in nature.

16.
The ciliated protozoa exhibit nuclear dimorphism. The genome of the somatic macronucleus arises from the germ-line genome of the micronucleus following conjugation. We have studied the fates of highly repetitious sequences in this process. Two cloned, tandemly repeated sequences from the micronucleus of Oxytricha fallax were used as probes in hybridizations to micronuclear and macronuclear DNA. The results of these experiments show: (1) the cloned repeats are members of two apparently unrelated repetitious sequence families, which each appear to comprise a few percent of the micronuclear genome, and (2) the amount of either family in the macronuclei from which our DNA was prepared is about 1/15 that found in an equal number of diploid micronuclei. Most, if not all, of the apparent macronuclear copies of these repeats can be accounted for by micronuclear contamination, which strongly suggests that these sequences are eliminated from the macronuclei and have no vegetative function.

17.
18.
Terminal deletions of units from α‐helical repeat proteins have provided insight into the physical origins of their cooperativity. To test if the same principles governing cooperativity apply to β‐sheet‐containing repeat proteins, we have created a series of C‐terminal deletion constructs from a large leucine‐rich repeat (LRR) protein, YopM. We have examined the structure and stability of the resulting deletion constructs by a combination of solution spectroscopy, equilibrium denaturation studies, and limited proteolysis. Surprisingly, a high degree of nonuniformity was found in the stability distribution of YopM. Unlike in previously studied repeat proteins, we identified several key LRRs in YopM that, on deletion, disrupt nearby structure at distances of up to three repeats. This partial unfolding model is supported by limited proteolysis studies and by point substitution in repeats predicted to be disordered as a result of deletion of adjacent repeats. We show that key internal‐ and terminal‐caps must be present to maintain the structural integrity in adjacent regions (roughly four LRRs long) of decreased stability. The finding that full‐length YopM maintains a high level of cooperativity in equilibrium unfolding underscores the importance of interfacial interactions in stabilizing locally unstable regions of structure.

19.
The ability to design specific amino acid sequences that fold into desired structures is central to engineering novel proteins. Protein design is also a good method to assess our understanding of sequence-structure and structure-function relationships. While beta-sheet structures are important elements of protein architecture, it has traditionally been more difficult to design beta-proteins than alpha-helical proteins. Taking advantage of the tandem repeated sequences that form the structural building blocks in a group of beta-propeller proteins, we have used a consensus design approach to engineer modular and relatively large scaffolds. An idealized WD repeat was designed from a structure-based sequence alignment with a set of structural guidelines. Using a plasmid sequential ligation strategy, artificial concatemeric genes with up to 10 copies of this idealized repeat were then constructed. Corresponding proteins with 4 through to 10 WD repeats were soluble when over-expressed in Escherichia coli. Notably, they were sufficiently stable in vivo to survive attack from endogenous proteases, and maintained a homogeneous, non-aggregated form in vitro. The results show that the beta-propeller scaffold is an attractive platform for future engineering work, particularly in experiments in which directed evolution techniques might improve the stability of the molecules and/or tailor them for a specific function.

20.
Three outstanding properties uniquely qualify repeats of base oligomers as the primordial coding sequences of all polypeptide chains. First, when compared with randomly generated base sequences in general, they are more likely to have long open reading frames. Second, periodical polypeptide chains specified by such repeats are more likely to assume either α-helical or β-sheet secondary structures than are polypeptide chains of random sequence. Third, provided that the number of bases in the oligomeric unit is not a multiple of 3, these internally repetitious coding sequences are impervious to randomly sustained base substitutions, deletions, and insertions. This is because the recurring periodicity of their polypeptide chains is given by three consecutive copies of the oligomeric unit translated in three different reading frames. Accordingly, when one reading frame is open, the other two are automatically open as well, all three being capable of coding for polypeptide chains of identical periodicity. Under this circumstance, a frame shift due to the deletion or insertion of a number of bases that is not a multiple of 3 fails to alter the downstream amino acid sequence, and even a base change causing premature chain-termination can silence only one of the three potential coding units. Newly arisen coding sequences in modern organisms are oligomeric repeats, and most of the older genes retain various vestiges of their original internal repetitions. Some of the genes (e.g., oncogenes) have even inherited the property of being impervious to randomly sustained base changes.
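The third property in the abstract above can be checked at the codon level with a short Python sketch. When the repeated unit has a length n that is not a multiple of 3, every reading frame reads the same cyclic series of codons, only starting at a different point, so all three frames encode the same periodic polypeptide up to a phase shift. The function names and the 7-base and 6-base example units are assumptions chosen for illustration.

```python
def codons(dna, frame):
    """Split dna into complete codons, starting at offset `frame` (0, 1, or 2)."""
    return [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]


def phase_shift(unit, frame, copies=12):
    """Return the smallest k such that reading frame `frame` of the tandem
    repeat yields the frame-0 codon series shifted by k codons, or None if
    the two frames read different codon series."""
    dna = unit * copies
    ref = codons(dna, 0)
    other = codons(dna, frame)
    for k in range(len(unit)):
        window = min(len(other), len(ref) - k)
        if other[:window] == ref[k:k + window]:
            return k
    return None


if __name__ == "__main__":
    print(phase_shift("GCTGGAC", 1))  # 7-base unit: not a multiple of 3
    print(phase_shift("GCTGGA", 1))   # 6-base unit: a multiple of 3
```

For the 7-base unit, frames 1 and 2 reproduce the frame-0 codon series at a fixed offset, so a frameshift changes only the phase of the encoded repeat. For the 6-base unit, the shifted frame reads codons that never occur in frame 0, and the function returns None. Identical codon series imply identical peptides, so no codon table is needed for the comparison.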
