Similar literature
20 similar articles found (search time: 31 ms)
1.
We have identified four novel repeats and two domains in cell surface proteins encoded by the Methanosarcina acetivorans genome and in some archaeal and bacterial genomes. The repeats correspond to a certain number of amino acid residues present in tandem in a protein sequence and each repeat is characterized by conserved sequence motifs. These correspond to: (a) a 42 amino acid (aa) residue RIVW repeat; (b) a 45 aa residue LGxL repeat; (c) a 42 aa residue LVIVD repeat; and (d) a 54 aa residue LGFP repeat. The domains correspond to a certain number of aa residues in a protein sequence that do not comprise internal repeats. These correspond to: (a) a 200 aa residue DNRLRE domain; and (b) a 70 aa residue PEGA domain. We discuss the occurrence of these repeats and domains in the different proteins and genomes analysed in this work.

2.
Amino acid sequence analysis corresponding to the PPE proteins in H37Rv and CDC 1551 strains of the Mycobacterium tuberculosis genomes resulted in the identification of a previously uncharacterized 225 amino acid-residue common region in 22 proteins. The pairwise sequence identities were as low as 18%. Conservation of amino acid residues was observed at fifteen positions that were distributed over the whole length of the region. The secondary structure corresponding to this region is predicted to be a mixture of α-helices and β-strands. Although the function is not known, proteins with this region specific to mycobacterial species may be associated with a common function. We further observed another group of 20 PPE proteins corresponding to the conserved C-terminal region comprising 44 amino acid residues with GFxGT and PxxPxxW sequence motifs. This region is preceded by a hydrophobic region, comprising 40–100 amino acid residues, that is flanked by charged amino acid residues. Identification of the conserved regions described above may be useful to detect related proteins from other genomes and assist the design of suitable experiments to test their corresponding functions. Amino acid sequence analysis corresponding to the PE proteins resulted in the identification of tandem repeats comprising 41–43 amino acid residues in the C-terminal variable regions in two PE proteins (Rv0978 and Rv0980). These correspond to the AB repeats that were first identified in some proteins of the Methanosarcina mazei genome, and were demonstrated as surface antigens. We observed the AB repeats also in several other proteins of hitherto uncharacterized function in Archaea and Bacteria genomes. Some of these proteins are also associated with another repeat called the C-repeat or the PKD-domain comprising 85 amino acid residues. The secondary structure corresponding to the AB repeat is predicted mainly as 4 β-strands. We suggest that proteins with AB repeats in Mycobacterium tuberculosis and other genomes may function as surface antigens. The M. leprae genome, however, does not contain either the AB or C-repeats and different proteins may therefore be recruited as surface antigens in the M. leprae genome compared to the M. tuberculosis genome.

3.
Full-length sequence of the cDNA for human erythroid beta-spectrin (total citations: 22; self-citations: 0; citations by others: 22)
Spectrin is the major molecular constituent of the red cell membrane skeleton. We have isolated overlapping human erythroid beta-spectrin cDNA clones and determined 6773 base pairs of contiguous nucleotide sequence. This includes the entire coding sequence of beta-spectrin. The sequence translates into a 2137 amino acid, 246-kDa peptide. beta-Spectrin is found to consist of three distinct domains. Domain I, at the N terminus, is a 272-amino acid region lacking resemblance to the spectrin repetitive motif. Sequences in this region exhibit striking sequence homology, at both nucleotide and amino acid levels, to the N-terminal "actin-binding" domains of alpha-actinin and dystrophin. Between residues 51 and 270 there is 55% amino acid identity to human dystrophin, with only four single amino acid gaps in alignment. Domain II consists of 17 spectrin repeats. Several sequence variations are observed in typical repeat structure. Homology to alpha-actinin extends beyond domain I into the N-terminal portion of domain II. Domain III, 52 amino acid residues at the C terminus, does not adhere to the spectrin repeat motif. Combining knowledge of spectrin primary structure with previously reported functional studies, it is possible to make several inferences regarding structure/function relationships within the beta-spectrin molecule.

4.
ShdA is a large outer membrane protein of the autotransporter family whose passenger domain binds the extracellular matrix proteins fibronectin and collagen I, possibly by mimicking the host ligand heparin. The ShdA passenger domain consists of approximately 1,500 amino acid residues that can be divided into two regions based on features of the primary amino acid sequence: an N-terminal nonrepeat region followed by a repeat region composed of two types of imperfect direct amino acid repeats, called type A and type B. The repeat region bound bovine fibronectin with an affinity similar to that for the complete ShdA passenger domain, while the nonrepeat region exhibited comparatively low fibronectin-binding activity. A number of fusion proteins containing truncated fragments of the repeat region did not bind bovine fibronectin. However, binding of the passenger domain to fibronectin was inhibited in the presence of immune serum raised to one truncated fragment of the repeat region that contained repeats A2, B8, A3, and B9. Furthermore, a monoclonal antibody that specifically recognized an epitope in a recombinant protein containing the A3 repeat inhibited binding of ShdA to fibronectin.

5.
Structural predictions for the central domain of dystrophin (total citations: 10; self-citations: 0; citations by others: 10)
The amino acid sequence of dystrophin indicates that the molecule has globular N- and C-terminal domains separated by a long central rod domain. The central rod contains multiple repeats, about 100 amino acids long and of variable length. These diverge sufficiently in sequence that, in previous studies, only 14 of the most similar repeats have been aligned and analysed in any detail. We show here that a heptad pattern of hydrophobic residues is preserved across all repeats. Using the heptad pattern together with a consensus sequence template, we identified and aligned 25 repeats in the dystrophin rod sequence. Each repeat consists of a constant-length core helix of 54 residues, coupled via a short linker to a weakly conserved variable-length helix, and then via a second linker to the next core. The variable-length helix appears truncated in repeats 10 and 13 and extended in repeats 4 and 20. The extension of repeat 20 is particularly interesting since it corresponds to a hotspot of dystrophy-inducing mutations. Detailed modelling suggests that the classical Speicher-Marchesi [(1984) Nature 311, 177-180] model for spectrin may not be appropriate to dystrophin without some modification. We propose that whilst the repeating structural motif in dystrophin is probably a bead of triple coiled coil, this bead is twice as massive as, and out of phase with, those proposed for spectrin. Our model raises the possibility that the rod domain of dystrophin may confer elasticity on the molecule. Deletions which truncate this region would then reduce the extensibility of the molecule without affecting actin crosslinking, consistent with their typically producing the relatively benign Becker phenotype of muscular dystrophy.

6.
We report the complete amino acid sequence of bovine conglutinin obtained by structural characterization of peptides derived from the protein by various chemical and enzymatic fragmentation methods. The protein consists of 351 amino acid residues including 55 apparent Gly-X-Y repeats with two interruptions. This 171-residue-long collagenous domain separates a short noncollagenous NH2-terminal region of 25 residues from the 155-residue-long globular COOH terminus, revealing the structural relation of conglutinin with mannose-binding proteins, pulmonary surfactant-associated proteins, and the complement component C1q. Eight hydroxylysine residues were found in the collagenous domain. All of these hydroxylysine residues, which occupy a Y position in a Gly-X-Y triplet, are possible glycosylation sites since no phenylthiohydantoin amino acid was identified in automated Edman degradation cycles corresponding to these sites. The noncollagenous COOH domain of conglutinin, on the other hand, contains a carbohydrate recognition domain which shares substantial sequence homology with C-type animal lectins. Conglutinin has the greatest sequence similarity with mannose-binding proteins and pulmonary surfactant-associated proteins.
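The figures in this abstract are internally consistent: the three regions (25 + 171 + 155) sum to the reported 351 residues, and 55 triplets cover 165 of the 171 collagenous residues, presumably leaving 6 residues for the two interruptions. As a minimal sketch of how uninterrupted Gly-X-Y runs might be located in a one-letter sequence, the Python below scans all three frames. The function name `gly_xy_runs`, the `min_triplets` threshold, and the toy sequence are illustrative assumptions, not the authors' method, which relied on peptide sequencing.

```python
def gly_xy_runs(seq, min_triplets=3):
    """Find uninterrupted runs of Gly-X-Y triplets in a one-letter protein sequence.

    Returns a sorted list of (start, end, n_triplets) with 0-based, end-exclusive
    coordinates. All three triplet frames are scanned independently.
    """
    runs = []
    for frame in range(3):
        start, count = None, 0
        # Iterating three positions past the end guarantees a final
        # non-triplet step that flushes a run reaching the sequence end.
        for i in range(frame, len(seq) + 3, 3):
            is_triplet = i + 3 <= len(seq) and seq[i] == "G"
            if is_triplet:
                if start is None:
                    start = i
                count += 1
            else:
                if start is not None and count >= min_triplets:
                    runs.append((start, start + 3 * count, count))
                start, count = None, 0
    return sorted(runs)


# Toy sequence (hypothetical, not conglutinin): three triplets starting at index 2.
toy = "MKGPPGAPGEKLLGSS"
print(gly_xy_runs(toy))
```

An interruption in a real collagenous domain would split one long run into several shorter ones, so counting triplets across adjacent runs would be needed to reproduce a figure such as "55 repeats with two interruptions".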

7.
Analysis of the sequence for the gene encoding PspA (pneumococcal surface protein A) of Streptococcus pneumoniae revealed the presence of four distinct domains in the mature protein. The structure of the N-terminal half of PspA was highly consistent with that of an alpha-helical coiled-coil protein. The alpha-helical domain was followed by a proline-rich domain (with two regions in which 18 of 43 and 5 of 11 of the residues are prolines) and a repeat domain consisting of 10 highly conserved 20-amino-acid repeats. A fourth domain consisting of a hydrophobic region too short to serve as a membrane anchor and a poorly charged region followed the repeats and preceded the translation stop codon. The C-terminal region of PspA did not possess features conserved among numerous other surface proteins, suggesting that PspA is attached to the cell by a mechanism unique among known surface proteins of gram-positive bacteria. The repeat domain of PspA was found to have significant homology with C-terminal repeat regions of proteins from Streptococcus mutans, Streptococcus downei, Clostridium difficile, and S. pneumoniae. Comparisons of these regions with respect to functions and homologies suggested that, through evolution, the repeat regions may have lost or gained a mechanism for attachment to the bacterial cell.

8.
We present a novel approach to design repeat proteins of the leucine-rich repeat (LRR) family for the generation of libraries of intracellular binding molecules. From an analysis of naturally occurring LRR proteins, we derived the concept to assemble repeat proteins with randomized surface positions from libraries of consensus repeat modules. As a guiding principle, we used the mammalian ribonuclease inhibitor (RI) family, which comprises cytosolic LRR proteins known for their extraordinary affinities to many RNases. By aligning the amino acid sequences of the internal repeats of human, pig, rat, and mouse RI, we derived a first consensus sequence for the characteristic alternating 28 and 29 amino acid residue A-type and B-type repeats. Structural considerations were used to replace all conserved cysteine residues, to define less conserved positions, and to decide where to introduce randomized amino acid residues. The so devised consensus RI repeat library was generated at the DNA level and assembled by stepwise ligation to give libraries of 2-12 repeats. Terminal capping repeats, known to shield the continuous hydrophobic core of the LRR domain from the surrounding solvent, were adapted from human RI. In this way, designed LRR protein libraries of 4-14 LRRs (equivalent to 130-415 amino acid residues) were obtained. The biophysical analysis of randomly chosen library members showed high levels of soluble expression in the Escherichia coli cytosol, monomeric behavior as characterized by gel-filtration, and alpha-helical CD spectra, confirming the success of our design approach.

9.
10.
The amino acid sequences of chick and slime mould alpha-actinin each contain four repeats of approximately 122 residues. These repeats are homologous to the 18-22 repeats, each of approximately 106 residues, found in the alpha and beta subunits of spectrin and fodrin, and to the multiple repeats of approximately 110 residues found in the Duchenne muscular dystrophy protein (dystrophin). The repeats correspond to the elongated rod-like portion of these molecules. We present a multiple sequence alignment of 21 repeats from this superfamily (8 alpha-actinin and 13 spectrin/fodrin), based on optimal pairwise alignments, from which a characteristic consensus pattern of amino acid types is deduced. Trp 46 is invariant in all but one repeat, and physicochemical classes of amino acids are conserved at 25 other positions. Secondary structure prediction on both the alpha-actinin and spectrin repeats, taken together with the distribution of proline residues in the sequences, strongly suggests that each repeated domain consists of a four-helix structure. Our predictions differ significantly from previous three-helix models based on analyses of fewer sequences. To determine possible interdomain regions, sites of limited proteolysis of the native chick alpha-actinin dimer were determined and located in the amino acid sequence. The majority of these sites were in corresponding positions in different repeats within a segment predicted as a long helix. We propose a model, consistent with the overall dimensions of the rod-like portions of the molecules, in which these long, probably interrupted helices link adjacent domains.

11.
Cloning and sequencing of a human pancreatic tumor mucin cDNA (total citations: 24; self-citations: 0; citations by others: 24)
A monospecific polyclonal antiserum against deglycosylated human pancreatic tumor mucin was used to select human pancreatic mucin cDNA clones from a lambda gt11 cDNA expression library developed from a human pancreatic tumor cell line. The full-length 4.4-kilobase mucin cDNA sequence included a 72-base pair 5'-untranslated region and a 307-base pair 3'-untranslated region. The predicted amino acid sequence for this cDNA revealed a protein of 122,071 daltons containing 1,255 amino acid residues of which greater than 60% were serine, threonine, proline, alanine, and glycine. Approximately two-thirds of the protein sequence consisted of identical 20-amino acid tandem repeats which were flanked by degenerate tandem repeats and nontandem repeat sequences on both the amino-terminal and carboxyl-terminal ends. The amino acid sequence also contained five putative N-linked glycosylation sites, a putative signal sequence and transmembrane domain, and numerous serine and threonine residues (potential O-linked glycosylation sites) outside and within the tandem repeat position. The cDNA and deduced amino acid sequences of the pancreatic mucin were over 99% homologous with a mucin cDNA sequence derived from breast tumor mucin, even though the native forms of these molecules are quite distinct in size and degree of glycosylation.

12.
J Kochan, M Perkins, J V Ravetch. Cell, 1986, 44(5): 689-696
Erythrocyte invasion by the malarial merozoite is a receptor-mediated process, an obligatory step in the development of the parasite. The Plasmodium falciparum protein GBP-130, which binds to the erythrocyte receptor glycophorin, is shown here to encode the binding site in a domain composed of a tandemly repeated 50 amino acid sequence. The amino acid sequence of GBP-130, deduced from the cloned and sequenced gene, reveals that the protein contains 11 highly conserved 50 amino acid repeats and a charged N-terminal region of 225 amino acids. Binding studies on recombinant proteins expressing different numbers of repeats suggest that a correlation exists between glycophorin binding and repeat number. Thus, a repeat domain, a common feature of plasmodial antigens, has been shown to have a function independent of the immune system. This conclusion is further supported by the ability of antibodies directed against the repeat sequence to inhibit the in vitro invasion of erythrocytes by merozoites.

13.
We have used antibodies to the basement membrane proteoglycan to screen lambda gt11 expression vector libraries and have isolated two cDNA clones, termed BPG 5 and BPG 7, which encode different portions of the core protein of the heparan sulfate basement membrane proteoglycan. These clones hybridize to a single mRNA species of approximately 12 kilobases. Amino acid sequences obtained on peptides derived from protease digests of the core protein were found in the deduced sequence, confirming the identity of these clones. BPG 5 spans 1986 base pairs and has an open reading frame of 662 amino acids. The amino acid sequence deduced from BPG 5 contains two cysteine-rich domains and two internally homologous domains lacking cysteine. The cysteine-rich domains show homology to the cysteine-rich domains of the laminin chains. A globule-rod structure, similar to that of the short arms of the laminin chains, is proposed for this region of the proteoglycan. The other clone, BPG 7, is 2193 base pairs long and has an open reading frame of 731 amino acids. The deduced sequence contains eight internal repeats with 2 cysteine residues in each repeat. These repeats show homology to the neural-cell adhesion molecule N-CAM and the plasma alpha 1B-glycoprotein. Looping structures similar to these proteins and to other proteins of the immunoglobulin gene superfamily are proposed for this region of the proteoglycan. The sequence DSGEY was found four times in this domain and could be heparan sulfate attachment sites.
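The clone lengths and open reading frame sizes quoted in this abstract can be cross-checked with simple codon arithmetic: both inserts are exact multiples of three, which suggests that each partial clone is open across its entire length. A minimal Python sketch of that check follows; the helper name `codons_in_orf` is a hypothetical illustration, not something from the paper.

```python
def codons_in_orf(n_bases):
    """Number of codons in an open reading frame spanning n_bases nucleotides.

    Raises ValueError if the length is not a whole number of codons.
    """
    if n_bases % 3 != 0:
        raise ValueError(f"{n_bases} bp is not a multiple of 3")
    return n_bases // 3


# Figures quoted in the abstract: (clone, insert length in bp, reported ORF in aa).
for clone, length_bp, reported_aa in [("BPG 5", 1986, 662), ("BPG 7", 2193, 731)]:
    codons = codons_in_orf(length_bp)
    print(clone, codons, codons == reported_aa)
```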

14.
The main structural domains of prion proteins, in particular the N-terminal region containing characteristic amino acid repeats, are well conserved among different species, despite divergence in primary sequence. The repeat region seems to play an important role, as verified by pathogenicity only observed in organisms having repeats composed of eight residues. In this work three different peptides belonging to the tandem repeat region of StPrP-2 from the Japanese pufferfish Takifugu rubripes have been considered; the coordination modes and conformations of their complexes with Cu(II) have been investigated by using potentiometric titrations, spectroscopic data, and restrained molecular dynamics simulations. In all cases the histidine imidazole(s) provide the anchoring site for copper, with the further involvement of amide nitrogens depending on the peptide sequence and on pH. An increase in copper binding affinity has been observed going from the shortest peptide, corresponding to a single repeat and containing two histidines, to the longest one, encompassing three repeats with six histidines.

15.
Streptococcus pyogenes expresses a fibronectin-binding surface protein (Sfb protein) which mediates adherence to human epithelial cells. The nucleotide sequence of the sfb gene was determined and the primary sequence of the Sfb protein was analysed. The protein consists of 638 amino acids and comprises five structurally distinct domains. The protein starts with an N-terminal signal peptide followed by an aromatic domain. The central part of the protein is formed by four proline-rich repeats which are flanked by non-repetitive spacer sequences. A second repeat region, consisting of four repeats that are distinct from the proline repeats and have been shown to form the fibronectin-binding domain, is located in the C-terminal part of the protein. The protein ends with a typical cell wall and membrane anchor region. Comparative sequence analysis of the N-terminal aromatic domain revealed similarities with carbohydrate-binding sites of other proteins. The proline repeat region of the Sfb protein shares characteristic features with proline-rich repeats of functionally distinct surface proteins from pathogenic Gram-positive cocci. Immunoelectron microscopy revealed an even distribution of the fibronectin-binding domain of Sfb protein on the surface of streptococcal cells. Analyses of 38 sfb genes originating from different S. pyogenes isolates revealed primary sequence variability in regions coding for the N-termini of mature Sfb proteins, whereas sequences coding for the central and C-terminal repeats were highly conserved. The repeat sequences are postulated to act as target sites for intragenic recombination events that result in variable numbers of repeats within the different sfb genes. A model of the Sfb protein is presented.

16.
The location of 16 of the 18 disulfide bonds in human plasma prekallikrein was determined by amino acid sequence analysis of cystinyl peptides produced by chemical and enzymatic digestions. A unique structure, named the apple domain, was established for each of the four tandem repeats in the amino-terminal portion of the molecule. The apple domains (90 or 91 amino acids) contain 3 highly conserved disulfide bonds linking the first and sixth, second and fifth, and third and fourth half-cystine residues present in each repeat. The fourth tandem repeat contains an extra disulfide bond that forms a second small loop within the apple domain. The carboxyl-terminal portion of plasma prekallikrein containing the catalytic region of the molecule was found to have disulfide bonds located in positions similar to those of other serine proteases.

17.
The complete sequence of dystrophin predicts a rod-shaped cytoskeletal protein (total citations: 181; self-citations: 0; citations by others: 181)
M Koenig, A P Monaco, L M Kunkel. Cell, 1988, 53(2): 219-228
The complete sequence of the human Duchenne muscular dystrophy (DMD) cDNA has been determined. The 3685 encoded amino acids of the protein product, dystrophin, can be separated into four domains. The 240 amino acid N-terminal domain has been shown to be conserved with the actin-binding domain of alpha-actinin. A large second domain is predicted to be rod-shaped and formed by the succession of 25 triple-helical segments similar to the repeat domains of spectrin. The repeat segment is followed by a cysteine-rich segment that is similar in part to the entire COOH domain of the Dictyostelium alpha-actinin, while the 420 amino acid C-terminal domain of dystrophin does not show any similarity to previously reported proteins. The functional significance of some of the domains is addressed relative to the phenotypic characteristics of some Becker muscular dystrophy patients. Dystrophin shares many features with the cytoskeletal proteins spectrin and alpha-actinin and is a large structural protein that is likely to adopt a rod shape about 150 nm in length.

18.
19.
The tat gene of HIV-1 is a potent trans-activator of gene expression from the HIV long terminal repeat (LTR). To define the functionally important regions of the product of the tat gene (Tat) of HIV-1, deletion, linker insertion and single amino acid substitution mutants within the Tat coding region of strain SF2 were constructed. The effect of these mutations on trans-activation was assessed by measuring the expression of the bacterial chloramphenicol acetyltransferase (CAT) reporter gene linked to the HIV-LTR. These studies have revealed that four different domains of the protein that map within the N-terminal 56 amino acid region are essential for Tat function. In addition to the essential domains, an auxiliary domain that enhances the activity of the essential region has also been mapped between amino acid residues 58 and 66. One of the essential domains maps in the N-terminal 20 amino acid region. The other three essential domains are highly conserved among the various strains of HIV-1 and HIV-2 as well as simian immunodeficiency virus (SIV). Of the conserved domains, one contains seven Cys residues, and single amino acid substitutions for several Cys residues indicate that they are essential for Tat function. The second conserved domain contains a Lys X Leu Gly Ile X Tyr motif in which the Lys residue is essential for trans-activation and the other residues are partially essential. The third conserved domain is strongly basic and appears to play a dual role. Mutants lacking this domain are deficient in trans-activation and in efficient targeting of Tat to the nucleus and nucleolus. The combination of the four essential domains and the auxiliary domain contributes to the near full activity observed with the 101 amino acid Tat protein.

20.
Colonization of oral tissues by Streptococcus sanguis may be influenced by a mucin-like salivary glycoprotein (SAG) through a calcium-dependent interaction with a specific bacterial receptor. We report the nucleotide and deduced amino acid sequence of the S. sanguis receptor (SSP-5) and show that this protein may bind sialic acid residues of SAG. The SSP-5 protein contains three unique structural domains, two of which consist of repetitive amino acid sequences. The N-terminal domain is comprised of four tandem copies of an 82-residue repeat which exhibits homology to M protein of Streptococcus pyogenes. This region is highly charged and predicted to be alpha-helical. A second hydrophilic repetitive domain consists of three copies of a 39-amino acid sequence containing 30% proline flanked by nonrepetitive proline-rich sequence. The third domain consists of 48% proline and resides near the C terminus of the protein. Secondary structure analysis of the SSP-5 sequence also identified four potential helix-turn-helix motifs that resembled E-F hand calcium binding domains. The SSP-5 protein is highly homologous to a surface antigen expressed by the mutans streptococci and the domain structure of SSP-5 is conserved within this family of proteins. The interactions of SSP-5 and of intact S. sanguis with SAG were inhibited by neuraminidase digestion of the salivary glycoprotein and by simple sugars containing sialic acid, suggesting that sialic acid is the primary ligand involved in the binding reaction.
