Similar literature
 20 similar articles found
1.
Many malarial antigens contain extensive arrays of tandemly repeated short amino acid sequences, and much of the antibody response induced by malaria infections is directed against these repeats. Indeed, it has been hypothesized that these repeats function to elicit a relatively ineffective T-cell-independent antibody response by the host. In order to test this hypothesis, tandem repeats of Plasmodium species were examined for a bias in composition favoring amino acids likely to form epitopes for the antibody. The genome of Plasmodium is very A+T-rich, and nucleotide compositional bias will, in itself, lead to a high proportion of hydrophilic amino acids. When this bias was controlled for, Plasmodium antigens did not show a higher proportion of hydrophilic amino acids than expected, but there was a significant reduction in the proportion of hydrophobic amino acids in the repeats of the antigens. The amino acid composition of the repeats was thus strikingly different from those seen both in the remainder of the antigens and in a sample of Plasmodium falciparum housekeeping genes.

2.
3.
The primary amino acid sequence of contactin, a neuronal cell surface glycoprotein of 130 kD that is isolated in association with components of the cytoskeleton (Ranscht, B., D. J. Moss, and C. Thomas. 1984. J. Cell Biol. 99:1803-1813), was deduced from the nucleotide sequence of cDNA clones and is reported here. The cDNA sequence contains an open reading frame for a 1,071-amino acid transmembrane protein with 962 extracellular and 89 cytoplasmic amino acids. In its extracellular portion, the polypeptide features six type 1 and two type 2 repeats. The six amino-terminal type 1 repeats (I-VI) each consist of 81-99 amino acids and contain two cysteine residues that are in the right context to form globular domains as described for molecules with immunoglobulin structure. Within the proposed globular region, contactin shares 31% identical amino acids with the neural cell adhesion molecule NCAM. The two type 2 repeats (I-II) are each composed of 100 amino acids and lack cysteine residues. They are 20-31% identical to fibronectin type III repeats. Both the structural similarity of contactin to molecules of the immunoglobulin supergene family, in particular the amino acid sequence resemblance to NCAM, and its relationship to fibronectin indicate that contactin could be involved in some aspect of cellular adhesion. This suggestion is further strengthened by its localization in neuropil containing axon fascicles and synapses.

4.
Nearly all of the insulin-like growth factor (IGF) in the circulation is bound in a heterotrimeric complex composed of IGF, IGF-binding protein-3, and the acid-labile subunit (ALS). Full-length clones encoding ALS have been isolated from human liver cDNA libraries by using probes based on amino acid sequence data from the purified protein. These clones encode a mature protein of 578 amino acids preceded by a 27-amino acid hydrophobic sequence indicative of a secretion signal. Expression of the cDNA clones in mammalian tissue culture cells results in the secretion into the culture medium of ALS activity that can form the expected complex with IGF-I and IGF-binding protein-3. The amino acid sequence of ALS is largely composed of 18-20 leucine-rich repeats of 24 amino acids. These repeats are found in a number of diverse proteins that, like ALS, participate in protein-protein interactions.

5.
Trichohyalin is a highly expressed protein within the inner root sheath of hair follicles and is similar, or identical, to a protein present in the hair medulla. In situ hybridization studies have shown that trichohyalin is a very early differentiation marker in both tissues and that in each case the trichohyalin mRNA is expressed from the same single copy gene. A partial cDNA clone for sheep trichohyalin has been isolated and represents approximately 40% of the full-length trichohyalin mRNA. The carboxy-terminal 458 amino acids of trichohyalin are encoded, and the first 429 amino acids consist of full- or partial-length tandem repeats of a 23 amino acid sequence. These repeats are characterized by a high proportion of charged amino acids. Secondary structure analyses predict that the majority of the encoded protein could form alpha-helical structures that might form filamentous aggregates of intermediate filament dimensions, even though the heptad motif obligatory for the intermediate filament structure itself is absent. The alternative structural role of trichohyalin could be as an intermediate filament-associated protein, as proposed from other evidence.

6.
We report the isolation and characterization of six overlapping cDNA clones that provide the first and complete amino acid sequence of the human laminin B1 chain. The cDNA clones cover 5613 nucleotides with 5358 nucleotides in an open reading frame encoding 1786 amino acids, including a 21-residue signal peptide-like sequence. Sequence analysis demonstrated the presence of two types of internal homology repeats that were found in clusters within the polypeptide chain. The type A repeats contain about 50 amino acids of which 8 are cysteine. These repeats are present in two clusters toward the NH2-terminal end of the chain and are separated from each other by about 220 amino acids. The two clusters contain five and eight consecutive repeats each. There are two copies of consecutive type B repeats of about 40 amino acids close to the COOH-terminal end. Computer analysis of the amino acid sequence of the B1 chain revealed the presence of structurally distinct domains that contain cysteine-rich repeats, globular regions, and helical structures. Using somatic cell hybrid methodology and in situ hybridization to metaphase chromosomes, it was established that the human laminin B1 gene (LAMB1) is located in the q22 region of chromosome 7.

7.
cDNA clones for nuclear pore complex glycoprotein p62 of two distantly related species, mouse and Xenopus laevis, were isolated. Antibodies raised against recombinant murine p62 react on protein blots with p62 of both species and decorate pore complexes. Analysis of the predicted protein sequence indicates that vertebrate p62 is organized into two structurally different regions. The entire carboxy-terminal half (86.7% identical amino acids) and the amino-terminal 56 amino acids (62.5% identity) have been highly conserved during evolution. The amino-terminal half contains several penta amino acid repeats and is able to form beta-sheets, whereas the carboxy-terminal half is predominantly organized in alpha-helical structures in part with heptad repeats typical for intermediate filament proteins. p62 of mouse and Xenopus is glycosylated by N-acetylglucosamine additions in the amino-terminal half. The region containing these potential glycosylation sites has been identified.

8.
Complete chromosome/genome sequences available from humans, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Saccharomyces cerevisiae were analyzed for the occurrence of mono-, di-, tri-, and tetranucleotide repeats. In all of the genomes studied, dinucleotide repeat stretches tended to be longer than other repeats. Additionally, tetranucleotide repeats in humans and trinucleotide repeats in Drosophila also seemed to be longer. Although the trends for different repeats are similar between different chromosomes within a genome, the density of repeats may vary between different chromosomes of the same species. The abundance or rarity of various di- and trinucleotide repeats in different genomes cannot be explained by nucleotide composition of a sequence or potential of repeated motifs to form alternative DNA structures. This suggests that in addition to nucleotide composition of repeat motifs, characteristic DNA replication/repair/recombination machinery might play an important role in the genesis of repeats. Moreover, analysis of complete genome coding DNA sequences of Drosophila, C. elegans, and yeast indicated that expansions of codon repeats corresponding to small hydrophilic amino acids are tolerated more, while strong selection pressures probably eliminate codon repeats encoding hydrophobic and basic amino acids. The locations and sequences of all of the repeat loci detected in genome sequences and coding DNA sequences are available at http://www.ncl-india.org/ssr and could be useful for further studies.
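
The kind of scan described in this abstract (locating perfect mono- to tetranucleotide repeat stretches) can be illustrated with a small regular-expression sketch. This is not the authors' pipeline: the function name `find_ssrs` and the minimum copy numbers are hypothetical choices made only for illustration.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) for perfect tandem repeats of 1-4 nt motifs."""
    if min_repeats is None:
        # Hypothetical thresholds: minimum copy number for each motif length.
        min_repeats = {1: 6, 2: 4, 3: 3, 4: 3}
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # ([ACGT]{k}) captures a motif; \1{n-1,} demands at least n copies in total.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each stretch is reported once under its shortest motif.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

print(find_ssrs("TTGCACACACACAGG"))
```

Positions are 0-based. A real survey would also need to handle both strands, ambiguous bases, and imperfect repeats, none of which this sketch attempts.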

9.
Endoglucanase B (CenB) from the bacterium Cellulomonas fimi is divided into five discrete domains by linker sequences rich in proline and hydroxyamino acids (A. Meinke, C. Braun, N. R. Gilkes, D. G. Kilburn, R. C. Miller, Jr., and R. A. J. Warren, J. Bacteriol. 173:308-314, 1991). The catalytic domain of 608 amino acids is at the N terminus. The sequence of the first 477 amino acids in the catalytic domain is related to the sequences of cellulases in family E, which includes procaryotic and eucaryotic enzymes. The sequence of the last 131 amino acids of the catalytic domain is related to sequences present in a number of cellulases from different families. The catalytic domain alone can bind to cellulose, and this binding is mediated at least in part by the C-terminal 131 amino acids. Deletion of these 131 amino acids reduces but does not eliminate activity. The catalytic domain is followed by three domains which are repeats of a 98-amino-acid sequence. The repeats are approximately 50% identical to two repeats of 95 amino acids in a chitinase from Bacillus circulans which are related to fibronectin type III repeats (T. Watanabe, K. Suzuki, K. Oyanagi, K. Ohnishi, and H. Tanaka, J. Biol. Chem. 265:15659-15665, 1990). The C-terminal domain of 101 amino acids is related to sequences, present in a number of bacterial cellulases and xylanases from different families, which form cellulose-binding domains (CBDs). It functions as a CBD when fused to a heterologous polypeptide. Cells of Escherichia coli expressing the wild-type cenB gene accumulate both native CenB and a stable proteolytic fragment of 41 kDa comprising the three repeats and the C-terminal CBD. The 41-kDa polypeptide binds to cellulose but lacks enzymatic activity.

10.
Many strains of Streptococcus pyogenes are known to express a receptor for IgA. The complete nucleotide sequence of the gene for such a receptor, protein Arp4, has been determined. The deduced amino acid sequence of 386 residues includes a signal sequence of 41 amino acids and a putative membrane anchor region, both of which are homologous to similar regions in other streptococcal surface proteins. The processed form of the IgA receptor has a length of 345 amino acids and a calculated molecular weight of 39544. The N-terminal sequence of the processed form is different from that previously found for a similar IgA receptor isolated from an S. pyogenes strain of type M60. The sequence of protein Arp4 shows extensive homology to the C-terminal half of streptococcal M proteins, but not to the streptococcal IgG receptor protein G or staphylococcal protein A. Apart from the membrane anchor, this homology includes a sequence of 119 amino acid residues containing three repeated units and a 54-residue sequence without repeats. The protein expressed in Escherichia coli is found in the periplasmic space, in which it constitutes the major protein. Protein Arp4 is the first example of a surface protein that has both immunoglobulin-binding capacity and structural features characteristic of M proteins.

11.
[PSI(+)], the prion form of the yeast Sup35 protein, results from the structural conversion of Sup35 from a soluble form into an infectious amyloid form. The infectivity of prions is thought to result from chaperone-dependent fiber cleavage that breaks large prion fibers into smaller, inheritable propagons. Like the mammalian prion protein PrP, Sup35 contains an oligopeptide repeat domain. Deletion analysis indicates that the oligopeptide repeat domain is critical for [PSI(+)] propagation, while a distinct region of the prion domain is responsible for prion nucleation. The PrP oligopeptide repeat domain can substitute for the Sup35 oligopeptide repeat domain in supporting [PSI(+)] propagation, suggesting a common role for repeats in supporting prion maintenance. However, randomizing the order of the amino acids in the Sup35 prion domain does not block prion formation or propagation, suggesting that amino acid composition is the primary determinant of Sup35's prion propensity. Thus, it is unclear what role the oligopeptide repeats play in [PSI(+)] propagation: the repeats could simply act as a non-specific spacer separating the prion nucleation domain from the rest of the protein; the repeats could contain specific compositional elements that promote prion propagation; or the repeats, while not essential for prion propagation, might explain some unique features of [PSI(+)]. Here, we test these three hypotheses and show that the ability of the Sup35 and PrP repeats to support [PSI(+)] propagation stems from their amino acid composition, not their primary sequences. Furthermore, we demonstrate that compositional requirements for the repeat domain are distinct from those of the nucleation domain, indicating that prion nucleation and propagation are driven by distinct compositional features.

12.
Yoshida H, Goedert M. Biochemistry 2002, 41(51):15203-15211
Tau is a major microtubule-associated protein in mammalian brain, where it exists as multiple isoforms that are produced from a single gene by alternative mRNA splicing. Here we present the first report on the structure and function of tau protein from a nonmammalian vertebrate. In the adult chicken brain, five main tau isoforms are expressed. One isoform has three tandem repeats, two isoforms have four repeats each, and two isoforms have five repeats each. Similar to mammalian tau, some chicken tau isoforms contain an amino-terminal insert of 53 amino acids. Unlike mammalian tau, a 34 amino acid insert in the proline-rich region upstream of the repeats is alternatively spliced in chicken tau. It is preceded by a constitutively expressed sequence of 17 amino acids that is absent in tau from human and rodent brains. The expression of chicken tau isoforms and their phosphorylation are developmentally regulated, similar to what has been described in mammalian brain. Functionally, chicken tau isoforms with five repeats have the greatest ability to promote microtubule assembly, followed by isoforms with four and three repeats, respectively. The 34 amino acid insert positively influences both the rate and the extent of microtubule assembly, whereas the 53 amino acid insert only influences the extent of assembly.

13.
Five cellulose-binding polypeptides were detected in Cellulomonas fimi culture supernatants. Two of them are CenA and CenB, endo-beta-1,4-glucanases which have been characterized previously; the other three were previously uncharacterized polypeptides with apparent molecular masses of 120, 95, and 75 kDa. The 75-kDa cellulose-binding protein was designated endoglucanase D (CenD). The cenD gene was cloned and sequenced. It encodes a polypeptide of 747 amino acids. Mature CenD is 708 amino acids long and has a predicted molecular mass of 74,982 Da. Analysis of the predicted amino acid sequence of CenD shows that the enzyme comprises four domains which are separated by short linker polypeptides: an N-terminal catalytic domain of 405 amino acids, two repeated sequences of 95 amino acids each, and a C-terminal domain of 105 amino acids which is > 50% identical to the sequences of cellulose-binding domains in Cex, CenA, and CenB from C. fimi. Amino acid sequence comparison placed the catalytic domain of CenD in family A, subtype 1, of beta-1,4-glycanases. The repeated sequences are more than 40% identical to the sequences of three repeats in CenB and are related to the repeats of fibronectin type III. CenD hydrolyzed the beta-1,4-glucosidic bond with retention of anomeric configuration. The activities of CenD towards various cellulosic substrates were quite different from those of CenA and CenB.

14.
All the protein sequences from the SWISS-PROT database were analyzed for occurrence of single amino acid repeats, tandem oligo-peptide repeats, and periodically conserved amino acids. Single amino acid repeats of glutamine, serine, glutamic acid, glycine, and alanine seem to be tolerated to a considerable extent in many proteins. Tandem oligo-peptide repeats of different types with varying levels of conservation were detected in several proteins and found to be conspicuous, particularly in structural and cell surface proteins. It appears that repeated sequence patterns may be a mechanism that provides regular arrays of spatial and functional groups, useful for structural packing or for one-to-one interactions with target molecules. To facilitate further explorations, a database of Tandem Repeats in Protein Sequences (TRIPS) has been developed and is available at URL: http://www.ncl-india.org/trips.
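
As a rough illustration of what detecting single amino acid repeats involves, the sketch below reports uninterrupted runs of one residue in a protein sequence. The function name `single_residue_runs`, the `min_len` cutoff of 5, and the example sequence are assumptions for illustration only; the abstract does not state the criteria actually used to build TRIPS.

```python
from itertools import groupby

def single_residue_runs(protein, min_len=5):
    """Return (start, residue, length) for each run of one residue of at least min_len."""
    runs = []
    pos = 0
    # groupby yields consecutive blocks of identical characters.
    for residue, group in groupby(protein):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, residue, length))
        pos += length
    return runs

# Made-up sequence containing a glutamine run and a serine run.
print(single_residue_runs("MKQQQQQQAASSSSSG"))
```

Start positions are 0-based, and runs interrupted by even a single different residue are reported as separate runs or not at all.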

15.
We describe the nucleotide sequences of several overlapping cDNA clones specific for human glutaminyl-tRNA synthetase. The identified open reading frame indicates that the enzyme is composed of 1440 amino acids. A stretch of about 360 amino acids of the human enzyme is highly conserved in bacterial and yeast glutaminyl-tRNA synthetases. However, the human enzyme is three times larger than the bacterial and twice as large as the yeast enzyme, suggesting that a considerable part of human glutaminyl-tRNA synthetase has evolved to perform functions other than the charging of tRNA. The sequence outside of the conserved core region includes three 57-amino acid repeats followed by a consecutive stretch of 11 charged amino acids. A computer-assisted search of two protein data banks reveals that the human glutaminyl-tRNA synthetase shares small blocks of amino acid similarities with several other synthetases of different amino acid specificities. Interestingly, the enzyme also possesses some regions of similarities with eukaryotic translation elongation factor EF-1 but not with any other sequence stored in the protein data banks. The coding regions of human and mouse glutaminyl-tRNA synthetase cDNAs are identical at 94% of the codons. However, the 3'-noncoding regions of mouse and human mRNAs are more divergent (approximately 68%) but both possess the potential to form stable secondary structures of similar general architecture.

16.
MOTIVATION: We consider the problem of identifying low-complexity regions (LCRs) in a protein sequence. LCRs are regions of biased composition, normally consisting of different kinds of repeats. RESULTS: We define new complexity measures to compute the complexity of a sequence based on a given scoring matrix, such as BLOSUM 62. Our complexity measures also consider the order of amino acids in the sequence and the sequence length. We develop a novel graph-based algorithm called GBA to identify LCRs in a protein sequence. In the graph constructed for the sequence, each vertex corresponds to a pair of similar amino acids. Each edge connects two pairs of amino acids that can be grouped together to form a longer repeat. GBA finds short subsequences as LCR candidates by traversing this graph. It then extends them to find longer subsequences that may contain full repeats with low complexities. Extended subsequences are then post-processed to refine repeats to LCRs. Our experiments on real data show that GBA has significantly higher recall compared to existing algorithms, including 0j.py, CARD, and SEG. AVAILABILITY: The program is available on request.
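
For orientation, the simplest notion of sequence complexity is the Shannon entropy of residue composition within a sliding window. The sketch below uses that composition-only measure; it is neither the GBA algorithm nor the scoring-matrix-based measures the abstract defines, which additionally account for residue similarity, order, and sequence length. The function names, the window width, and the entropy threshold are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(window):
    """Shannon entropy (bits) of the residue composition of a non-empty string."""
    counts = Counter(window)
    n = len(window)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

def low_complexity_windows(seq, width=12, threshold=2.2):
    """Return 0-based start positions of windows whose entropy is at or below threshold."""
    # width and threshold are illustrative values, not taken from the abstract.
    return [i for i in range(len(seq) - width + 1)
            if shannon_entropy(seq[i:i + width]) <= threshold]

print(low_complexity_windows("QQQQQQQQQQQQ"))   # homopolymer window, entropy 0
print(low_complexity_windows("ACDEFGHIKLMN"))   # 12 distinct residues, high entropy
```

Because entropy ignores residue order, a window such as "AAAAAABBBBBB" scores the same as "ABABABABABAB"; this is one limitation that order-aware measures like those described in the abstract are meant to address.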

17.
MOTIVATION: Tandem peptide repeats play a key role in self-assembly and aggregation processes. A notable example is the occurrence of tandem peptide repeats in prion proteins and their role in the aggregation process that leads to the formation of the prion. One of the structural characteristics that is evident from the comparison of mammalian and yeast prion proteins is the presence of aromatic residues in their tandem repeats. These residues are accompanied by glycine residues before and/or after the aromatic amino acid. Such aromatic-glycine conjugates are also present in the tandem repeats of the large family of the bacterial ice nucleation proteins. To study the significance of such aromatic-glycine occurrences, a global analysis of all the aromatic octapeptide repeats in the Swiss-Prot and TrEMBL databases was conducted. The search pattern was formulated to compare the number of conjugates of each of the 20 natural amino acids before or after the different aromatic residues. RESULTS: The occurrence of aromatic-glycine conjugates appears to be significantly higher than that of aromatic conjugates with any other amino acid. Furthermore, all six combinations of glycine occurring before or after the three aromatic residues are present. No such pattern was observed for any other amino acid. The significance of the findings is discussed in the context of the physicochemical properties of aromatic-glycine conjugates and their possible role in facilitating aggregate formation.
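
The comparison described here (which residues sit immediately before or after an aromatic residue) amounts to counting adjacent residue pairs. The following is a minimal sketch under the assumption that the three aromatic residues are phenylalanine, tryptophan, and tyrosine (F, W, Y); the function name `flanking_counts` is hypothetical, and the abstract's actual search pattern over octapeptide repeats is not reproduced.

```python
from collections import Counter

# Assumption: the "three aromatic residues" are Phe, Trp, and Tyr.
AROMATIC = set("FWY")

def flanking_counts(seq):
    """Count (previous, aromatic) and (aromatic, next) residue pairs in seq."""
    before, after = Counter(), Counter()
    for i, aa in enumerate(seq):
        if aa in AROMATIC:
            if i > 0:
                before[(seq[i - 1], aa)] += 1
            if i + 1 < len(seq):
                after[(aa, seq[i + 1])] += 1
    return before, after

# Two copies of the mammalian PrP octarepeat PHGGGWGQ as an illustrative input.
before, after = flanking_counts("PHGGGWGQ" * 2)
print(before[("G", "W")], after[("W", "G")])
```

Judging whether glycine is over-represented would further require comparing such counts against an expectation based on background amino acid frequencies, which this sketch leaves out.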

18.
The amino acid sequences of chick and slime mould alpha-actinin each contain four repeats of approximately 122 residues. These repeats are homologous to the 18-22 repeats, each of approximately 106 residues, found in the alpha and beta subunits of spectrin and fodrin, and to the multiple repeats of approximately 110 residues found in the Duchenne muscular dystrophy protein (dystrophin). The repeats correspond to the elongated rod-like portion of these molecules. We present a multiple sequence alignment of 21 repeats from this superfamily (8 alpha-actinin and 13 spectrin/fodrin), based on optimal pairwise alignments, from which a characteristic consensus pattern of amino acid types is deduced. Trp 46 is invariant in all but one repeat, and physicochemical classes of amino acids are conserved at 25 other positions. Secondary structure predictions for both the alpha-actinin and spectrin repeats, taken together with the distribution of proline residues in the sequences, strongly suggest that each repeated domain consists of a four-helix structure. Our predictions differ significantly from previous three-helix models based on analyses of fewer sequences. To determine possible interdomain regions, sites of limited proteolysis of the native chick alpha-actinin dimer were determined and located in the amino acid sequence. The majority of these sites were in corresponding positions in different repeats within a segment predicted as a long helix. We propose a model, consistent with the overall dimensions of the rod-like portions of the molecules, in which these long, probably interrupted helices link adjacent domains.

19.

Background  

Genome-wide and cross-species comparison of amino acid repeats is an intriguing problem in biology, mainly because of the highly polymorphic nature and diverse functions of amino acid repeats. Innate protein repeats constitute vital functional and structural regions in proteins. Repeats are of great consequence in the evolution of proteins, as is evident from the analysis of repeats in different organisms. In the post-genomic era, the availability of protein sequences encoded in different genomes provides a unique opportunity to perform large-scale comparative studies of amino acid repeats. ProtRepeatsDB is a relational database of perfect and mismatch repeats, designed as a resource and collection of tools for the detection and cross-species comparison of different types of amino acid repeats.

20.