Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
2.
Filaggrin is an intermediate filament-associated protein which functions to aggregate keratin intermediate filaments in the stratum corneum of mammalian epidermis. It is synthesized as a large precursor protein, profilaggrin, that consists of multiple filaggrin units and is localized in keratohyalin granules. In this report, we describe the characterization of cosmid genomic clones containing the human profilaggrin gene coding for 11 complete filaggrin repeats of 324 amino acids each. At the amino- and carboxyl-terminal ends of human profilaggrin are leader and tail peptide sequences of 293 and 157 amino acids, respectively, which differ from filaggrin. The leader peptide is composed of two distinct domains: an 81-residue segment which shows significant homology to the S-100 family of EF hand-containing calcium-binding proteins, and a hydrophilic second domain of 212 residues. The gene is divided into three exons, with one intron (approximately 9.6 kilobase pairs) in the 5' noncoding region and a second one of 570 base pairs between the EF hands. The position of intron 2 is identical to that of other members of the S-100-like family. The presence of an S-100-like domain suggests that profilaggrin binds calcium and that the calcium-binding domain is functionally significant in the formation of keratohyalin and/or the subsequent processing of profilaggrin to filaggrin, both of which may be calcium-dependent events.

3.
Filaggrin is an intermediate filament-associated protein that is involved in aggregation of keratin filaments in fully cornified cells of the mammalian epidermis, and is an important marker for epidermal differentiation. In this report, the sequence of a rat cDNA clone coding for a portion of the polymeric precursor, profilaggrin, is presented. The cDNA is 2,314 bp long with 1,875 bp of coding region ending with an A-T-rich 3' noncoding region. Genomic analysis indicates that the profilaggrin gene consists of 20 +/- 2 repeats of 1,218 bp of sequence coding for 406 amino acids, making the mRNA at least 25-27 kb in length. Each repeat consists of a filaggrin domain and a linker sequence with an estimated size of 380 and 26 amino acids, respectively. High levels of profilaggrin mRNA are found only in keratinizing epithelia. Comparison of the rat filaggrin sequence with that of mouse and human filaggrin and with the sequence of phosphorylated peptides from mouse profilaggrin indicates that the proteins share extensive amino acid sequence similarities, especially in the two phosphorylated regions. Proteolytic processing sites are also quite similar in rat and mouse. The three species show blocks of sequence that are similar in length and composition which alternate with sequences that are variable in length. This analysis suggests that the evolution of the present-day filaggrins has been constrained by maintenance of phosphorylation sites and overall amino acid composition. The cDNAs for the profilaggrins are similar in structure, reflecting genes that have simple repeating structures and lack introns within their coding regions. Mouse and rat profilaggrin terminate with a nonpolar sequence atypical of the rest of the coding region, and have similar 3' noncoding regions. To explain these observations, a novel evolutionary model is proposed.

4.
Organization, structure, and polymorphisms of the human profilaggrin gene   Total citations: 8, self-citations: 0, citations by others: 8
Profilaggrin is a major protein component of the keratohyalin granules of mammalian epidermis. It is initially expressed as a large polyprotein precursor and is subsequently proteolytically processed into individual functional filaggrin molecules. We have isolated genomic DNA and cDNA clones encoding the 5'- and 3'-ends of the human gene and mRNA. The data reveal the presence of likely "CAT" and "TATA" sequences, an intron in the 5'-untranslated region, and several potential regulatory sequences. While all repeats are of the same length (972 bp, 324 amino acids), sequences display considerable variation (10-15%) between repeats on the same clone and between different clones. Most variations are attributable to single-base changes, but many also involve changes in charge. Thus, human filaggrin consists of a heterogeneous population of molecules of different sizes, charges, and sequences. However, amino acid sequences encoding the amino and carboxyl termini are more conserved, as are the 5' and 3' DNA sequences flanking the coding portions of the gene. The presence of unique restriction enzyme sites in these conserved flanking sequences has enabled calculations on the size of the full-length gene and the numbers of repeats in it: depending on the source of genomic DNA, the gene contains 10, 11, or 12 filaggrin repeats that segregate in kindred families by normal Mendelian genetic mechanisms. This means that the human profilaggrin gene system is also polymorphic with respect to size due to simple allelic differences between different individuals. The amino- and carboxyl-terminal sequences of profilaggrin contain partial or truncated repeats with unusual un-filaggrin-like sequences on the termini. (ABSTRACT TRUNCATED AT 250 WORDS)

5.
Two functional cytosolic thymidine kinase (tk) cDNA clones were isolated from a mouse L-cell library. An RNA blot analysis indicated that one of these clones contains a nearly full-length tk sequence and that LTK- cells contain little or no TK message. The nucleotide sequences of both clones were determined, and the functional mouse tk cDNA contains 1,156 base pairs. An analysis of the sequence implied that there is an untranslated 32-nucleotide region at the 5' end of the mRNA, followed by an open reading frame of 699 nucleotides. The 3' untranslated region is 422 nucleotides long. Thus, the gene codes for a protein containing 233 amino acids, with a molecular weight of 25,873. A comparison of the coding sequences of the mouse tk cDNA with the human and chicken tk genes revealed about 86 and 70% homology, respectively. We also isolated the tk gene from a mouse C57BL/10J cosmid library. The structural organization was determined by restriction mapping, Southern blotting, and heteroduplex analysis of the cloned sequences, in combination with a mouse tk cDNA. The tk gene spans approximately 11 kilobases and contains at least five introns. Southern blot analysis revealed that this gene is deleted in mouse LTK- cells, consistent with the inability of these cells to synthesize TK message. This analysis also showed that tk-related sequences are present in the genomes of several mouse strains, as well as in LTK- cells. These segments may represent pseudogenes.

6.
Structure of the murine complement factor H gene   Total citations: 3, self-citations: 0, citations by others: 3
Factor H is a regulatory protein of the alternative pathway of complement activation comprised of 20 tandem repeating units of 60 amino acids each. A factor H cDNA clone was used to identify 17 genomic clones from a cosmid library. Four clones were selected for analysis of intron/exon junctions and 5' and 3' regions of the gene and for mapping of the exons. The factor H gene was found to be comprised of 22 exons. Each repeating unit is encoded by one exon, except the second repeat, which is coded by two exons; the leader sequence is encoded by a separate exon. The exons range in size from 77 to 210 base pairs (bp) and average 178 bp. They span a region of approximately 100 kilobases (kb) on chromosome 1. The leader sequence exon is 26 kb upstream of the first repeat exon, representing the largest intron. The other introns range in size from 86 bp to 12.9 kb, and the average intron size is 4.7 kb. Analysis of the genomic organization of the factor H gene has provided insight into the protein structure and will enable the construction of deletion mutants for functional studies.

7.
The gene encoding the circumsporozoite protein (CSP) from the rodent malaria parasite, Plasmodium yoelii, has been cloned and the nucleotide sequence has been determined. The gene encodes a protein of 367 amino acids as deduced from the nucleotide sequence. This gene is structurally similar to other Plasmodium spp. CSP genes in that it contains putative hydrophobic signal and anchor sequences at the NH2 and COOH termini, respectively, two small regions (Regions I and II) that are conserved in all CSP genes analyzed to date, and a central region containing the immunodominant repeating peptide sequence. Unlike other CSP genes, however, the immunodominant repeat region of the gene is composed of two distinctly different types of tandem repeats. One repeating unit is six amino acids (Gln-Gly-Pro-Gly-Ala-Pro) in length while the other is only four (Gln-Gln-Pro-Pro) residues long. A synthetic peptide, Gln-Gly-Pro-Gly-Ala-Pro X 3, strongly inhibits the binding of anti-CSP monoclonal antibody to sporozoite antigens while another peptide, Gln-Gln-Pro-Pro X 4, weakly inhibits the binding of this same antibody to sporozoite antigens. This work should allow the construction of a mouse model system to parallel human vaccine trials.

8.
9.
10.
p36 is a major substrate of both viral and growth factor receptor associated protein kinases. This protein has recently been named calpactin I heavy chain since it is the large subunit of a Ca2+-dependent phospholipid and actin binding heterotetramer. The primary structure of p36 has been determined from analysis of cloned cDNA. The protein contains 338 amino acids, has an approximate molecular weight of 39,000, and is comprised of several distinct domains, including four 75 amino acid repeats. From two overlapping cosmid clones isolated from different mouse genomic liver libraries, the complete intron/exon structure of the p36 gene was determined and the 5' and 3' noncoding regions of the gene were analyzed. The coding and 3' untranslated region of the p36 gene contains 12 exons which range in size from 48 to 322 base pairs (bp) with an average size of 107 bp. The repeat structures found at the protein level are not delineated by single exons, but the N-terminal p11-binding domain is encoded by a single exon. Structural mapping of the gene demonstrated that the lengths of the first two introns in the coding region are together approximately 6 kilobases (kb), while the other introns range in size from 600 to 3600 bp with an average size of 1650 bp. The p36 gene is at least 22 kb in length and has a coding sequence of approximately 1 kb, representing only 4.5% of the gene. (ABSTRACT TRUNCATED AT 250 WORDS)

11.
Human plasma carboxypeptidase N is a 280-kDa tetramer with two high molecular mass (83-kDa) glycosylated subunits which protect the two 50-kDa catalytic subunits and keep them in the circulation. An initial clone for the 83-kDa subunit was obtained by screening two lambda gt11 human liver cDNA expression libraries with antiserum specific for carboxypeptidase N or the 83-kDa subunit. The libraries were rescreened with the labeled cloned cDNA, and the largest clone obtained (2536-base pair insert) was completely sequenced. The deduced protein sequence matched the sequence of several tryptic peptides from the 83-kDa subunit but did not contain the NH2-terminal sequence. The remaining portion of the protein coding sequence was synthesized by the polymerase chain reaction, cloned, and sequenced. The composite cDNA sequence is 2870 base pairs long with an open reading frame of 1608 base pairs coding for a protein of 536 amino acids (Mr = 58,762). The protein sequence contains seven potential N-linked glycosylation sites and a threonine/serine-rich region which is a potential site for attachment of O-linked carbohydrate. The most striking feature is a region (residues 68-355) containing 12 leucine-rich tandem repeats of 24 residues with the following consensus sequence: P-X-X-alpha-F-X-X-L-X-X-L-X-X-L-X-L-X-X-N-X-L-X-X-L (X = any amino acid and alpha = aliphatic amino acids, I, L, or V). This repeating pattern is found in the leucine-rich alpha 2-glycoprotein and in other proteins where it might mediate interactions with macromolecules. This region also contains five sequences with heptad repeating leucine residues comprising a leucine zipper motif. The leucine-rich domain likely constitutes an important structural or functional element in the interactions of the 83- and 50-kDa subunits to form the active tetramer of carboxypeptidase N.

12.
We have used antibodies to the basement membrane proteoglycan to screen lambda gt11 expression vector libraries and have isolated two cDNA clones, termed BPG 5 and BPG 7, which encode different portions of the core protein of the heparan sulfate basement membrane proteoglycan. These clones hybridize to a single mRNA species of approximately 12 kilobases. Amino acid sequences obtained on peptides derived from protease digests of the core protein were found in the deduced sequence, confirming the identity of these clones. BPG 5 spanned 1986 base pairs and has an open reading frame of 662 amino acids. The amino acid sequence deduced from BPG 5 contains two cysteine-rich domains and two internally homologous domains lacking cysteine. The cysteine-rich domains show homology to the cysteine-rich domains of the laminin chains. A globule-rod structure, similar to that of the short arms of the laminin chains, is proposed for this region of the proteoglycan. The other clone, BPG 7, is 2193 base pairs long and has an open reading frame of 731 amino acids. The deduced sequence contains eight internal repeats with 2 cysteine residues in each repeat. These repeats show homology to the neural-cell adhesion molecule N-CAM and the plasma alpha 1B-glycoprotein. Looping structures similar to these proteins and to other proteins of the immunoglobulin gene superfamily are proposed for this region of the proteoglycan. The sequence DSGEY was found four times in this domain and could be heparan sulfate attachment sites.

13.
There exist in the Xenopus laevis genome clusters of tandemly repeated DNA sequences, consisting of two types of 393-base-pair repeating unit. Each such cluster contains several units of one of these paired tandem repeats (PTR-1), followed by several units of the other repeat (PTR-2). The number of repeats of each type is variable from cluster to cluster and averages about seven of each type per cluster. Every cluster has ca. 1,000 base pairs of common left flanking sequence (adjacent to the PTR-1 repeats) and 1,000 base pairs of common right flanking sequence (adjacent to the PTR-2 repeats). Beyond these common flanks, the DNA sequences are different in the eight cloned genomic fragments we have studied. Thus, the hundreds of PTR clusters in the genome are dispersed at apparently unrelated sites. Nucleotide sequences of representative PTR-1 and PTR-2 repeats are 64% homologous. These sequences do not reveal an obvious function. However, the related species X. mulleri and X. borealis have sequences homologous to PTR-1 and PTR-2, which show the same repeat lengths and genomic organization. This evolutionary conservation suggests positive selection for the clusters. Maintenance of these sequences at dispersed sites imposes constraints on possible mechanisms of concerted evolution.

14.
15.
A 34,000-Da protein (P34) is one of the four major soybean oil body proteins observed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis of isolated organic solvent-extracted oil bodies from mature seeds. P34 is processed during seedling growth to a 32,000-Da polypeptide (P32) by the removal of an amino-terminal decapeptide (Herman, E.M., Melroy, D.L., and Buckhout, T.J. (1990) Plant Physiol, in press). A soybean lambda ZAP II cDNA library constructed from RNA isolated from midmaturation seeds was screened with monoclonal antibodies directed against two different epitopes of P34. The isolated cDNA clone encoding P34 contains 1,350 base pairs terminating in a poly(A)+ tail and an open reading frame 1,137 base pairs in length. The open reading frame includes a deduced amino acid sequence which matches 23 of 25 amino-terminal amino acids determined by automated Edman degradation of P34 and P32. The cDNA predicts a mature protein of 257 amino acids and of 28,641 Da. The open reading frame extends 5' from the known amino terminus of P34 encoding a possible precursor and signal sequence segments with a combined additional 122 amino acids. Prepro-P34 is deduced to be a polypeptide of 42,714 Da, indicating that the cDNA clone apparently encodes a polypeptide of 379 amino acids. A comparison of the nucleotide and deduced amino acid sequences in the GenBank Data Bank with the sequence of P34 has shown considerable sequence similarity to the thiol proteases of the papain family. Southern blot analysis of genomic DNA indicated that the P34 gene has a low copy number.

16.
J. Trowsdale, A. Kelly, J. Lee, S. Carson, P. Austin, P. Travers. Cell, 1984, 38(1):241-249
Three overlapping cosmid clones contain coding sequences for four HLA Class II genes, provisionally identified as two HLA-SB alpha and two HLA-SB beta genes. The genes are in the order beta, alpha, beta, alpha, inverted with respect to each other. One of the SB beta genes contains a 513 bp sequence that appears to be a processed pseudogene, flanked by direct 17 bp repeat sequences, in the intron upstream of the beta 1 exon. The pseudogene is homologous to a family of sequences of approximately 25-40 members, most of which are not on chromosome 6. A cDNA clone, highly homologous to the pseudogene, except for its 5' end, contains a normal poly(A) addition site and a poly(A) tail. The cDNA clone is homologous to a single-copy gene in both man and mouse, encoded on human chromosome 15. A search of published DNA sequences identified a mouse sequence, with about 77% similarity to the pseudogene sequence, in the negative strand of an intron in a mouse dihydrofolate reductase gene. The second SB beta gene does not contain the pseudogene sequence.

17.
We isolated DNA clones of intracisternal A-particle (IAP) genes from the genome of an Asian wild mouse, Mus caroli. A typical M. caroli IAP gene was 6.5 kilobase pairs in length and had long terminal repeat (LTR) sequences at both ends. The size of the LTR was 345 base pairs in clone L20, and two LTRs at both ends of this clone were linked to directly repeating cellular sequences of 6 base pairs. Each LTR possessed most of the structural features commonly associated with the retrovirus LTR. The restriction map of the M. caroli IAP gene resembled that of Mus musculus, although the M. caroli IAP gene was 0.4 kilobase pairs shorter than the M. musculus IAP gene in two regions. Sequence homology between the M. caroli and M. musculus IAP LTRs was calculated as about 80%, whereas the LTR sequence of the Syrian hamster IAP gene was about 60% homologous to the M. caroli LTR. The reiteration frequency of the M. caroli IAP genes was estimated as 200 to 400 copies per haploid genome, which is at least 10 times the reported value. These results suggest that the IAP genes observed in the genus Mus are present in multiple copies with structures closely resembling the integrated retrovirus gene.

18.
The cellular nucleotide sequences flanking the mouse intracisternal A-particle gene 81 were determined. The results indicated that they were made of many small oligonucleotide repeats, both direct and indirect in orientation. These two different kinds of repeating sequences were often found to overlap. The overall base composition of this region is relatively A + T rich. The most important feature of the sequences determined was that they consist of several repeated dinucleotide tracts containing a (CA)16 repeating cluster in the 5' end flanking region of one strand and another repeating dinucleotide cluster, (GT)16, in the 3' end flanking region of the same strand of this gene. In addition, two clusters of 9 continuous 5-bp repeat units, GCTTT, were found in the 3' end flanking region. The possible roles of such repeating sequences were discussed.

19.
Multigene families encode the proline-rich proteins that are so prominent in human saliva and are dramatically induced in mouse and rat salivary glands by isoproterenol treatment and by feeding tannins. A cDNA encoding an acidic proline-rich protein of rat has been sequenced (Ziemer, M. A., Swain, W. F., Rutter, W. J., Clements, S., Ann, D. K., and Carlson, D. M. (1984) J. Biol. Chem. 259, 10475-10480). This study presents the nucleotide sequences of five additional proline-rich protein cDNAs complementary to both mouse and rat parotid and submandibular gland mRNAs. Amino acid compositions deduced from the nucleotide sequences are typical for proline-rich proteins: 25-45% proline, 18-22% glycine, and 18-22% glutamine and generally an absence of sulfur-containing amino acids except for the initiator methionine. These proline-rich proteins display unusual repeating peptide sequences of 14-19 amino acids. The derived amino acid sequence of the cDNA insert of plasmid pMP1 from mouse has a 19-amino acid sequence which is repeated four times. The inserts of plasmids pUMP40 and pUMP4, also from mouse, encode 12 and 11 repeats of a 14-amino acid peptide, respectively. These repetitive sequences, and others from rat and mouse cDNAs and from human genomic clones, all show very high homologies and likely evolved from duplication of internal portions of an ancestral gene. Gene conversion could account for the high degree of conservation of nucleotide sequences of the repeat regions. Proteins derived from the nucleotide sequences are all characterized by four general regions: a putative signal peptide, a transition region, the repetitive region, and a carboxyl-terminal region. The 5'-flanking sequences and sequences encoding the putative signal peptides are highly conserved (greater than 94%) in all six cDNAs. This sequence conservation may be important in the regulation of the biosynthesis of these unusual proteins.

20.
Genomic DNA containing the protein coding region for Drosophila cAMP-dependent protein kinase catalytic subunit has been cloned and sequenced. The probe used to detect and isolate the gene fragment was constructed from two partially complementary synthetic oligonucleotides and contains 60 base pairs that encode (using Drosophila codon preferences) amino acids 195-214 of the beef heart catalytic subunit. In reduced stringency hybridization conditions, the probe recognizes two target sites in fly genomic DNA with 85% homology. One of these sites is in the cAMP-dependent protein kinase catalytic subunit gene, which was isolated as a 3959-base pair HindIII fragment. This fragment contains all of the protein coding portion, 900 base pairs upstream of the initiator ATG, and 2000 base pairs downstream of the termination codon (TAG). The coding portion of the gene contains no introns and yields a protein of 352 amino acids. There is a 2-amino acid insertion near the N terminus of the fly protein relative to the beef and mouse enzymes. Of the remaining 350 amino acids, 273 are invariant in the three species. A probe derived from the coding sequence of the HindIII clone hybridizes strongly to a 5100-base poly(A)+ RNA and weakly to 4100- and 3400-base poly(A)+ RNAs expressed in adult flies. A 2100-base pair EcoRI genomic fragment containing the second site recognized by the 60-base pair probe has also been cloned. DNA sequence analysis demonstrates that this fragment is part of the cGMP-dependent protein kinase gene or a close homolog. The catalytic subunit gene and the cGMP-dependent protein kinase gene have been located in regions 30C and 21D, respectively, of chromosome 2.

