Similar articles
Found 20 similar articles (search time: 46 ms)
1.
Using automated in silico methods, we have identified four novel repeats and five novel domains in proteins encoded by the Pyrobaculum aerophilum str. IM2 proteome. A "repeat" corresponds to a region comprising fewer than 55 amino acid residues that occurs more than once in the protein sequence and is sometimes present in tandem. A "domain" corresponds to a conserved region comprising more than 55 amino acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) an 85-residue AAG domain, (2) a 72-residue GFGN domain, (3) a 43-residue KGG repeat, (4) a 25-residue RWE repeat, (5) a 25-residue RID repeat, (6) a 108-residue NDFA domain, (7) a 140-residue VxY domain, (8) a 35-residue LLPN repeat and (9) a 98-residue GxY domain. Each repeat or domain is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure.
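The length rule stated in this abstract can be sketched in a few lines of Python. This is an illustration only, not the authors' in silico pipeline: the 55-residue cutoff and the nine region lengths come from the abstract, while the names `classify_region` and `REGION_LENGTHS` are hypothetical, and a region of exactly 55 residues is reported as "undefined" because the abstract does not say how to treat it. The sketch also ignores the abstract's other criteria (recurrence within a sequence, conservation).

```python
from collections import Counter

# Cutoff taken from the abstract: repeats are shorter, domains are longer.
CUTOFF = 55

# Region lengths (in amino acid residues) as listed in the abstract.
REGION_LENGTHS = {
    "AAG": 85, "GFGN": 72, "KGG": 43, "RWE": 25, "RID": 25,
    "NDFA": 108, "VxY": 140, "LLPN": 35, "GxY": 98,
}


def classify_region(length: int) -> str:
    """Label a region by length alone (assumption: length is the only criterion)."""
    if length < CUTOFF:
        return "repeat"
    if length > CUTOFF:
        return "domain"
    return "undefined"  # exactly 55 residues is not covered by the abstract


labels = {name: classify_region(n) for name, n in REGION_LENGTHS.items()}
counts = Counter(labels.values())
print(labels)
print(dict(counts))
```

Applied to the nine listed regions, the rule yields four repeats and five domains, which matches the counts given in the abstract.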

2.
We have identified four novel repeats and two domains in cell surface proteins encoded by the Methanosarcina acetivorans genome and in some archaeal and bacterial genomes. The repeats correspond to a certain number of amino acid residues present in tandem in a protein sequence and each repeat is characterized by conserved sequence motifs. These correspond to: (a) a 42 amino acid (aa) residue RIVW repeat; (b) a 45 aa residue LGxL repeat; (c) a 42 aa residue LVIVD repeat; and (d) a 54 aa residue LGFP repeat. The domains correspond to a certain number of aa residues in a protein sequence that do not comprise internal repeats. These correspond to: (a) a 200 aa residue DNRLRE domain; and (b) a 70 aa residue PEGA domain. We discuss the occurrence of these repeats and domains in the different proteins and genomes analysed in this work.

3.
We have determined the complete cDNA sequence of rat plectin from a number of well-characterized overlapping lambda gt11 clones. The 4,140-residue predicted amino acid sequence (466,481 Da) is consistent with a three-domain structural model in which a long central rod domain, having mainly an alpha-helical coiled-coil conformation, is flanked by globular NH2- and COOH-terminal domains. The plectin sequence has a number of repeating motifs. The rod domain has five subregions, each approximately 200 residues long, in which there is a strong repeat in the charged amino acids at 10.4 residues that may be involved in association between plectin molecules. The globular COOH-terminal domain has a prominent six-fold tandem repeat, with each repeat having a strongly conserved central region based on nine tandem repeats of a 19-residue motif. The plectin sequence has several marked similarities to that of desmoplakin (Green, K. J., D. A. D. Parry, P. M. Steinert, M. L. A. Virata, R. M. Wagner, B. D. Angst, and L. A. Nilles. 1990. J. Biol. Chem. 265:2603-2612), which has a shorter coiled-coil rod domain with a similar 10.4-residue charge periodicity and a COOH-terminal globular domain with three tandem repeats homologous to the six found in plectin. The plectin sequence also has homologies to that of the bullous pemphigoid antigen. Northern blot analysis indicated that there is a significant degree of conservation of plectin genes between rat, human, and chicken and that, as shown previously at the protein level, plectin has a wide tissue distribution. There appeared to be a single rat plectin gene that gave rise to a 15-kb message. Expression of polypeptides encoded by defined fragments of plectin cDNA in E. coli has also been used to localize the epitopes of a range of monoclonal and serum antibodies. This enabled us to tentatively map a sequence involved in plectin-vimentin and plectin-lamin B interactions to a restricted region of the rod domain.

4.
Mishima M, Shida T, Yabuki K, Kato K, Sekiguchi J, Kojima C. Biochemistry 2005, 44(30): 10153-10163
Bacillus subtilis CwlC is a cell wall lytic N-acetylmuramoyl-L-alanine amidase that plays an important role in mother-cell lysis during sporulation. The enzyme consists of an N-terminal catalytic domain with C-terminal tandem repeats. The repeats [repeat 1 (residues 184-219) and repeat 2 (residues 220-255)] are termed CwlCr. We report on the solution structure of CwlCr as determined by multidimensional NMR, including the use of 36 ʰ³J(NC′)-derived hydrogen bond restraints and 64 residual ¹D(NH) dipolar couplings. Two tandem repeats fold into a pseudo-2-fold symmetric single-domain structure consisting of a βαββαβ-fold containing numerous contacts between the repeats. Hydrophobic residues important for structural integrity are conserved between the repeats, and are located symmetrically. We also present NMR analysis of the circularly permuted repeat mutant of CwlCr. Secondary structure content from the chemical shifts and hydrogen bonds derived from ʰ³J(NC′) show that the mutant folds into a structure similar to that of the wild type, suggesting that the repeats are exchangeable. This implies that conserved hydrophobic residues are crucial for maintaining the folding of the repeats. While monitoring the chemical shift perturbations following the addition of digested soluble peptidoglycan fragments, we identified two peptidoglycan interaction sites positioned symmetrically at the edges of CwlCr, approximately 28 Å from each other.

5.
Venturia inaequalis is a hemi-biotrophic fungus that causes scab disease of apple. A recently-identified gene from this fungus, cin1 (cellophane-induced 1), is up-regulated over 1000-fold in planta and considerably on cellophane membranes, and encodes a cysteine-rich secreted protein of 523 residues with eight imperfect tandem repeats of ~60 amino acids. The Cin1 sequence has no homology to known proteins and appears to be genus-specific; however, Cin1 repeats and other repeat domains may be structurally similar. An NMR-derived structure of the first two repeat domains of Cin1 (Cin1-D1D2) and a low-resolution model of the full-length protein (Cin1-FL) using SAXS data were determined. The structure of Cin1-D1D2 reveals that each domain comprises a core helix-loop-helix (HLH) motif as part of a three-helix bundle, and is stabilized by two intra-domain disulfide bonds. Cin1-D1D2 adopts a unique protein fold as DALI and PDBeFOLD analysis identified no structural homology. A ¹⁵N backbone NMR dynamic analysis of Cin1-D1D2 showed that a short stretch of the inter-domain linker has large amplitude motions that give rise to reciprocal domain-domain mobility. This observation was supported by SAXS data modeling, where the scattering length density envelope remains thick at the domain-domain boundary, indicative of inter-domain dynamics. Cin1-FL SAXS data models a loosely-packed arrangement of domains, rather than the canonical parallel packing of adjacent HLH repeats observed in α-solenoid repeat proteins. Together, these data suggest that the repeat domains of Cin1 display a "beads-on-a-string" organization with inherent inter-domain flexibility that is likely to facilitate interactions with target ligands.

6.
7.
Janatova J, Reid KB, Willis AC. Biochemistry 1989, 28(11): 4754-4761
Several plasma and membrane proteins belong to a superfamily of structurally related proteins that contain internal homology of a variable number (2-30) of repeating units. Each SCR (short consensus repeat) unit is approximately 60 amino acid residues in length, with the positions of 1 Trp, 2 Pro, and 4 Cys residues being conserved. The aim of this study was to provide experimental evidence that each SCR may exist as an independent structural domain maintained by disulfide bonds. The well-characterized C4b-binding protein (C4BP) with eight SCR units in each of its seven identical chains was chosen for this study. Analysis of the disulfide-bonding pattern indicated that intrachain disulfide bonds may be localized within each SCR unit, with the first and third and the second and fourth half-cystines in each unit being linked. This pattern of disulfides may confer to C4BP (and to other structurally related proteins) a conformation which apparently allows the assembly of the SCR units (4-30) in a tandem fashion. Such an arrangement of the polypeptide chain(s) may explain, in part, the elongated shape of these protein molecules. The structural motif of the SCR units of C4BP is discussed in relation to those previously described for the type II domain of fibronectin and the kringle structure present in various proteins of the coagulation system.

8.
The complete cDNA and polypeptide sequences of human erythroid alpha-spectrin. Total citations: 11 (self-citations: 0; citations by others: 11)
Overlapping human erythroid alpha-spectrin cDNA clones were isolated from lambda gt11 libraries constructed from cDNAs of human fetal liver and erythroid bone marrow. The composite 8001-base pair (bp) cDNA nucleotide sequence contains 187-bp 5'- and 528-bp 3'-untranslated regions and has a single long open reading frame of 7287 bp that encodes a polypeptide of 2429 residues. As previously described (Speicher, D. W., and Marchesi, V. T. (1984) Nature 311, 177-180), spectrin is composed largely of homologous 106-amino acid repeat units. From the amino acid sequence deduced from the cDNA, alpha-spectrin can be divided into 22 segments. Segments 1-9 and 12-19 are homologous and can therefore be considered repeats; the average number of identical residues in pairwise comparisons of these repeats is 22 out of 106, or 21%. Of these 17 repeats, 11 are exactly 106 amino acids in length, whereas five others differ from this length by a single residue. Segments 11, 20, and 21, although less homologous, appear to be related to the more highly conserved repeat units. The very N-terminal 22 residues, segment 10, which is atypical both in length and sequence, and the C-terminal 150 residues in segment 22 appear to be unrelated to the conserved repeat units. The sequence of the erythroid alpha-spectrin polypeptide chain is compared to that of human alpha-fodrin and chicken alpha-actinin to which it is related. alpha-Spectrin is more distantly related to dystrophin.
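Two of the figures quoted in this abstract follow from simple arithmetic, sketched below in Python. The numbers (a 7287-bp open reading frame, 2429 residues, 22 identical residues out of 106) are taken from the abstract; the function names `codons_in_orf` and `percent_identity` are hypothetical helpers introduced only for this illustration.

```python
def codons_in_orf(orf_bp: int) -> int:
    """Number of complete codons in an open reading frame of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_bp // 3


def percent_identity(identical: int, aligned_length: int) -> float:
    """Identical positions as a percentage of the aligned length."""
    return 100.0 * identical / aligned_length


# Values quoted in the abstract.
print(codons_in_orf(7287))                 # one codon per encoded residue
print(round(percent_identity(22, 106)))    # average pairwise identity of the repeats
```

The 7287-bp reading frame corresponds to 2429 codons, matching the stated polypeptide length, and 22 identical residues out of 106 rounds to the quoted 21%.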

9.
Carbamoyl-phosphate synthetase (CPS) from Escherichia coli is a heterodimeric protein. The larger of the two subunits (Mᵣ approximately 118,000) contains a pair of homologous domains of approximately 400 residues each that are approximately 40% identical in amino acid sequence. The carboxy phosphate (residues 1-400) and carbamoyl phosphate domains (residues 553-933) also contain approximately 79 differentially conserved residues. These are residues that are conserved throughout the bacterial evolution of CPS in one of these homologous domains but not the other. The role of these differentially conserved residues in the structural and catalytic properties of CPS was addressed by swapping segments of these residues from one domain to the other. Nine of these chimeric mutant enzymes were constructed, expressed, purified, and characterized. A majority of the mutants were unable to synthesize any carbamoyl phosphate and the rest were severely crippled. True tandem repeat chimeric proteins were constructed by the complete substitution of one homologous domain sequence for the other. Neither of the two possible chimeric proteins was structurally stable. These results have been interpreted to demonstrate that the two homologous domains in the large subunit of CPS are functionally and structurally nonequivalent. This nonequivalence is a direct result of the specific functions each of these domains must perform during the overall synthesis of carbamoyl phosphate in the wild type enzyme and the specific structural alterations imposed by the differentially conserved residues.

10.
The complete primary structure of MSP-1, a major water-soluble glycoprotein in the foliated calcite shell layer of the scallop Patinopecten yessoensis, is reported. The full-length complementary DNA for MSP-1 isolated by polymerase chain reaction contained a sequence for a signal peptide of 20 amino acids followed by a polypeptide of 820 amino acids with a calculated molecular mass of 74.5 kDa. The deduced amino acid sequence of MSP-1 includes a high proportion of Ser (32%), Gly (25%), and Asp (20%), and the predicted isoelectric point is 3.2; in these respects, MSP-1 is a typical acidic glycoprotein of mineralized tissues. A repeated modular structure characterizes MSP-1, with a sequence unit between 158 and 177 amino acids in length being repeated 4 times in tandem in the middle part of the protein. The repeated unit comprises 3 modules (SG, D, and K domains), each having a distinct amino acid composition and sequence. The SG domain is almost exclusively composed of Ser and Gly residues. The D domain is rich in Asp residues and in potential N-glycosylation and phosphorylation sites. The K domain is rich in Gly residues and has a core of basic residues. The Asp residues are arranged more or less regularly in the D domains, exhibiting some repeated motifs such as Asp-Gly-Ser-Asp and Asp-Ser-Asp. Further, the 4 D domains show remarkable overall sequence similarities to each other. These observations suggest that the regular arrangements of COO⁻ groups in the D domain side chains may be important for specific control of crystal growth. Received September 19, 2000; accepted February 9, 2001

11.
The precursor of pulmonary surfactant-associated protein, SP-B, is composed of an NH2-terminal domain of 30 residues (a-type domain) and three tandem repeats of about 90 residues (b-type domain); biophysically active mature SP-B corresponds to the second b-type repeat. Consensus sequences constructed for the b-type repeats were used to search the database for homologous sequences, and the search revealed that prosaposin and sulfated glycoprotein 1 show a remarkable homology with these repeats. The domain organizations of the latter proteins, however, differ from that of SP-B precursor inasmuch as they contain four tandem copies of the b-type domain and a-type domains are present both in the NH2-terminal and COOH-terminal parts of the proteins. The implications of the homology of saposins and SP-B for their structure and function are discussed.

12.
The tat gene of HIV-1 is a potent trans-activator of gene expression from the HIV long terminal repeat (LTR). To define the functionally important regions of the product of the tat gene (Tat) of HIV-1, deletion, linker insertion and single amino acid substitution mutants within the Tat coding region of strain SF2 were constructed. The effect of these mutations on trans-activation was assessed by measuring the expression of the bacterial chloramphenicol acetyltransferase (CAT) reporter gene linked to the HIV-LTR. These studies have revealed that four different domains of the protein that map within the N-terminal 56 amino acid region are essential for Tat function. In addition to the essential domains, an auxiliary domain that enhances the activity of the essential region has also been mapped between amino acid residues 58 and 66. One of the essential domains maps in the N-terminal 20 amino acid region. The other three essential domains are highly conserved among the various strains of HIV-1 and HIV-2 as well as simian immunodeficiency virus (SIV). Of the conserved domains, one contains seven Cys residues, and single amino acid substitutions for several Cys residues indicate that they are essential for Tat function. The second conserved domain contains a Lys X Leu Gly Ile X Tyr motif in which the Lys residue is essential for trans-activation and the other residues are partially essential. The third conserved domain is strongly basic and appears to play a dual role. Mutants lacking this domain are deficient in trans-activation and in efficient targeting of Tat to the nucleus and nucleolus. The combination of the four essential domains and the auxiliary domain contributes to the near full activity observed with the 101 amino acid Tat protein.

13.
In this study we have identified and characterized dopamine receptor D4 (DRD4) exon III tandem repeats in 33 publicly available nucleotide sequences from different mammalian species. We found that the tandem repeat in canids could be described in a novel and simple way, namely, as a structure composed of 15- and 12-bp modules. Tandem repeats composed of 18-bp modules were found in sequences from the horse, zebra, onager, and donkey, Asiatic bear, polar bear, common raccoon, dolphin, harbor porpoise, and domestic cat. Several of these sequences have been analyzed previously without a tandem repeat being found. In the domestic cow and gray seal we identified tandem repeats composed of 36-bp modules, each consisting of two closely related 18-bp basic units. A tandem repeat consisting of 9-bp modules was identified in sequences from mink and ferret. In the European otter we detected an 18-bp tandem repeat, while a tandem repeat consisting of 27-bp modules was identified in a sequence from European badger. Both these tandem repeats were composed of 9-bp basic units, which were closely related to the 9-bp repeat modules identified in the mink and ferret. Tandem repeats could not be identified in sequences from rodents. All tandem repeats possessed a high GC content with a strong bias for C. On phylogenetic analysis of the tandem repeats, evolutionarily related species were clustered into the same groups. The degree of conservation of the tandem repeats varied significantly between species. The deduced amino acid sequences of most of the tandem repeats exhibited a high propensity for disorder. This was also the case with an amino acid sequence of the human DRD4 exon III tandem repeat, which was included in the study for comparative purposes. We identified proline-containing motifs for SH3 and WW domain binding proteins, potential phosphorylation sites, PDZ domain binding motifs, and FHA domain binding motifs in the amino acid sequences of the tandem repeats. The numbers of potential functional sites varied pronouncedly between species. Our observations provide a platform for future studies of the architecture and evolution of the DRD4 exon III tandem repeat, and they suggest that differences in the structure of this tandem repeat contribute to specialization and generation of diversity in receptor function.
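The idea of describing a sequence as tandem copies of a fixed-length module, and of measuring its GC content, can be illustrated with a small Python sketch. This is a toy under stated assumptions and not the method used in the study: it finds only exact, back-to-back copies of a module of a given length (real DRD4 repeats are imperfect), the function names `find_tandem_repeats` and `gc_content` are hypothetical, and the example sequence is synthetic rather than a real DRD4 sequence.

```python
def find_tandem_repeats(seq: str, unit_len: int, min_copies: int = 2):
    """Return (start, unit, copies) for runs of exact tandem copies of a unit_len-bp module."""
    hits = []
    i = 0
    n = len(seq)
    while i + unit_len * min_copies <= n:
        unit = seq[i:i + unit_len]
        copies = 1
        # Count how many times the module repeats immediately after itself.
        while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
            copies += 1
        if copies >= min_copies:
            hits.append((i, unit, copies))
            i += copies * unit_len  # skip past the whole run
        else:
            i += 1
    return hits


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in an upper-case DNA sequence."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


# Synthetic example: three exact copies of a 9-bp, GC-rich module.
example = "AT" + "GCCCCGGCC" * 3 + "TTA"
print(find_tandem_repeats(example, unit_len=9))
print(gc_content("GCCCCGGCC"))
```

Detecting the diverged modules described in the abstract would need an approach that tolerates mismatches, for example alignment-based scoring; the exact-match scan above is only meant to make the notion of a k-bp module concrete.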

14.
Glucansucrases of oral streptococci and Leuconostoc mesenteroides have a common pattern of structural organization and characteristically contain a domain with a series of tandem amino acid repeats in which certain residues are highly conserved, particularly aromatic amino acids and glycine. In some glucosyltransferases (GTFs) the repeat region has been identified as a glucan binding domain (GBD). Such GBDs are also found in several glucan binding proteins (GBP) of oral streptococci that do not have glucansucrase activity. Alignment of the amino acid sequences of 20 glucansucrases and GBP showed the widespread conservation of the 33-residue A repeat first identified in GtfI of Streptococcus downei. Site-directed mutagenesis of individual highly conserved residues in recombinant GBD of GtfI demonstrated the importance of the first tryptophan and the tyrosine-phenylalanine pair in the binding of dextran, as well as the essential contribution of a basic residue (arginine or lysine). A microplate binding assay was developed to measure the binding affinity of recombinant GBDs. GBD of GtfI was shown to be capable of binding glucans with predominantly alpha-1,3 or alpha-1,6 links, as well as alternating alpha-1,3 and alpha-1,6 links (alternan). Western blot experiments using biotinylated dextran or alternan as probes demonstrated a difference between the binding of streptococcal GTF and GBP and that of Leuconostoc glucansucrases. Experimental data and bioinformatics analysis showed that the A repeat motif is distinct from the 20-residue CW motif, which also has conserved aromatic amino acids and glycine and which occurs in the choline-binding proteins of Streptococcus pneumoniae and other organisms.

15.
16.
Jeong EJ, Hwang GS, Kim KH, Kim MJ, Kim S, Kim KS. Biochemistry 2000, 39(51): 15775-15782
Human bifunctional glutamyl-prolyl-tRNA synthetase (EPRS) contains three tandem repeats linking the two catalytic domains. These repeated motifs have been shown to be involved in protein-protein and protein-nucleic acid interactions. A single copy of the homologous motif has also been found in several different aminoacyl-tRNA synthetases. The solution structure of repeat 1 (EPRS-R1) and the secondary structure of the whole appended domain containing three repeated motifs in EPRS (EPRS-R123) were determined by nuclear magnetic resonance (NMR) spectroscopy. EPRS-R1 consists of two helices (residues 679-699 and 702-721) arranged in a helix-turn-helix, which is similar to other RNA binding proteins and the J-domain of DnaJ, and EPRS-R123 is composed of three helix-turn-helix motifs linked by an unstructured loop. When tRNA is bound to the appended domain, chemical shifts of several residues in each repeat are perturbed. However, the perturbed residues in each repeat are not the same although they are in the same binding surface, suggesting that each repeat in the appended domain is dynamically arranged to maximize contacts with tRNA. The affinity of tRNA for the three-repeated motif was much higher than for the single motif. These results indicate that each of the repeated motifs has a weak intrinsic affinity for tRNA, but the repetition of the motifs may be required to enhance binding affinity. Thus, the results of this work provide information on the RNA-binding mode of the multifunctional peptide motif attached to different ARSs and the functional reason for the repetition of this motif.

17.
We have determined the nucleotide and amino acid sequences of mouse alpha 2(IV) collagen which is 1707 amino acids long. The primary structure includes a putative 28-residue signal peptide and contains three distinct domains: 1) the 7 S domain (residues 29-171), which contains 5 cysteine and 8 lysine residues, is involved in the cross-linking and assembly of four collagen IV molecules; 2) the triple-helical domain (residues 172-1480), which has 24 sequence interruptions in the Gly-X-Y repeat up to 24 residues in length; and 3) the NC1 domain (residues 1481-1707), which is involved in the end-to-end assembly of collagen IV and is the most highly conserved domain of the protein. Alignment of the primary structure of the alpha 2(IV) chain with that of the alpha 1(IV) chain reported in the accompanying paper (Muthukumaran, G., Blumberg, B., and Kurkinen, M. (1989) J. Biol. Chem. 264, 6310-6317) suggests that a heterotrimeric collagen IV molecule contains 26 imperfections in the triple-helical domain. The proposed alignment is consistent with the physical data on the length and flexibility of collagen IV.

18.
We present a novel approach to design repeat proteins of the leucine-rich repeat (LRR) family for the generation of libraries of intracellular binding molecules. From an analysis of naturally occurring LRR proteins, we derived the concept to assemble repeat proteins with randomized surface positions from libraries of consensus repeat modules. As a guiding principle, we used the mammalian ribonuclease inhibitor (RI) family, which comprises cytosolic LRR proteins known for their extraordinary affinities to many RNases. By aligning the amino acid sequences of the internal repeats of human, pig, rat, and mouse RI, we derived a first consensus sequence for the characteristic alternating 28 and 29 amino acid residue A-type and B-type repeats. Structural considerations were used to replace all conserved cysteine residues, to define less conserved positions, and to decide where to introduce randomized amino acid residues. The so devised consensus RI repeat library was generated at the DNA level and assembled by stepwise ligation to give libraries of 2-12 repeats. Terminal capping repeats, known to shield the continuous hydrophobic core of the LRR domain from the surrounding solvent, were adapted from human RI. In this way, designed LRR protein libraries of 4-14 LRRs (equivalent to 130-415 amino acid residues) were obtained. The biophysical analysis of randomly chosen library members showed high levels of soluble expression in the Escherichia coli cytosol, monomeric behavior as characterized by gel-filtration, and alpha-helical CD spectra, confirming the success of our design approach.

19.
The core domain of human immunodeficiency virus type 1 (HIV-1) integrase (IN) contains a D,D(35)E motif, named for the phylogenetically conserved glutamic acid and aspartic acid residues and the invariant 35 amino acid spacing between the second and third acidic residues. Each acidic residue of the D,D(35)E motif is independently essential for the 3′-processing and strand transfer activities of purified HIV-1 IN protein. Using a replication-defective viral genome with a hygromycin selectable marker, we recently reported that a mutation at any of the three residues of the D,D(35)E motif produces a 10³- to 10⁴-fold reduction in infectious titer compared with virus encoding wild-type IN (A. D. Leavitt et al., J. Virol. 70:721–728, 1996). The infectious titer, as measured by the number of hygromycin-resistant colonies formed following infection of cells in culture, was less than a few hundred colonies per μg of p24. To understand the mechanism by which the mutant virions conferred hygromycin resistance, we characterized the integrated viral DNA in cells infected with virus encoding mutations at each of the three residues of the D,D(35)E motif. We found the integrated viral DNA to be colinear with the incoming viral genome. DNA sequencing of the junctions between integrated viral DNA and host DNA showed that (i) the characteristic 5-bp direct repeat of host DNA flanking the HIV-1 provirus was not maintained, (ii) integration often produced a deletion of host DNA, (iii) integration sometimes occurred without the viral DNA first undergoing 3′-processing, (iv) integration sites showed a strong bias for a G residue immediately adjacent to the conserved viral CA dinucleotide, and (v) mutations at each of the residues of the D,D(35)E motif produced essentially identical phenotypes. We conclude that mutations at any of the three acidic residues of the conserved D,D(35)E motif so severely impair IN activity that most, if not all, integration events by virus encoding such mutations are not IN mediated. IN-independent provirus formation may have implications for anti-IN therapeutic agents that target the IN active site.

20.
The neuronal ceroid lipofuscinoses (NCL) are a group of progressive neurodegenerative disorders characterized by the deposition of autofluorescent proteinaceous fingerprint or curvilinear bodies. We have found that CLN3, the gene underlying the juvenile form of NCL, is very tightly linked to the dinucleotide repeat marker D16S285 on chromosome 16. Integration of D16S285 into the genetic map of chromosome 16 by using the Centre d'Etude du Polymorphisme Humain panel of reference pedigrees yielded a favored marker order in the CLN3 region of qtel-D16S150-.08-D16S285-.04-D16S148-.02-D16S67-ptel. The most likely location of the disease gene, near D16S285 in the D16S150-D16S148 interval, was favored by odds of greater than 10⁴:1 over the adjacent D16S148-D16S67 interval, which was recently reported as the minimum candidate region. Analysis of D16S285 in pedigrees with late-infantile NCL virtually excluded the CLN3 region, suggesting that these two forms of NCL are genetically distinct.
