Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Nucleotide sequence of the gene for human factor IX (antihemophilic factor B)   Total citations: 97 (self-citations: 0, citations by others: 97)
Two different human genomic DNA libraries were screened for the gene for blood coagulation factor IX by employing a cDNA for the human protein as a hybridization probe. Five overlapping lambda phages were identified that contained the gene for factor IX. The complete DNA sequence of about 38 kilobases for the gene and the adjacent 5' and 3' flanking regions was established by the dideoxy chain termination and chemical degradation methods. The gene contained about 33.5 kilobases of DNA, including seven introns and eight exons within the coding and 3' noncoding regions of the gene. The eight exons code for a prepro leader sequence and 415 amino acids that make up the mature protein circulating in plasma. The intervening sequences range in size from 188 to 9473 nucleotides and contain four Alu repetitive sequences, including one in intron A and three in intron F. A fifth Alu repetitive sequence was found immediately flanking the 3' end of the gene. A 50 base pair insert in intron A was found in a clone from one of the genomic libraries but was absent in clones from the other library. Intron A as well as the 3' noncoding region of the gene also contained alternating purine-pyrimidine sequences that provide potential left-handed helical DNA or Z-DNA structures for the gene. KpnI repetitive sequences were identified in intron D and the region flanking the 5' end of the gene. The 5' flanking region also contained a 1.9-kb HindIII subfamily repeat. The seven introns in the gene for factor IX were located in essentially the same position as the seven introns in the gene for human protein C, while the first three were found in positions identical with those in the gene for human prothrombin.

2.
Nucleotide sequence of the gene for human prothrombin   Total citations: 23 (self-citations: 0, citations by others: 23)
S J Degen, E W Davie. Biochemistry, 1987, 26(19): 6165-6177
A human genomic DNA library was screened for the gene coding for human prothrombin with a cDNA coding for the human protein. Eighty-one positive lambda phage were identified, and three were chosen for further characterization. These three phage hybridized with 5' and/or 3' probes prepared from the prothrombin cDNA. The complete DNA sequence of 21 kilobases of the human prothrombin gene was determined and included a 4.9-kilobase region that was previously sequenced. The gene for human prothrombin contains 14 exons separated by 13 intervening sequences. The exons range in size from 25 to 315 base pairs, while the introns range from 84 to 9447 base pairs. Ninety percent of the gene is composed of intervening sequence. All the intron splice junctions are consistent with sequences found in other eukaryotic genes, except for the presence of GC rather than GT on the 5' end of intervening sequence L. Thirty copies of Alu repetitive DNA and two copies of partial KpnI repeats were identified in clusters within several of the intervening sequences, and these repeats represent 40% of the DNA sequence of the gene. The size, distribution, and sequence homology of the introns within the gene were then compared to those of the genes for the other vitamin K dependent proteins and several other serine proteases.

3.
Two human gamma-crystallin genes are linked and riddled with Alu-repeats   Total citations: 7 (self-citations: 0, citations by others: 7)
A human genomic cosmid clone, pHcos gamma-1, has been isolated containing two closely linked gamma-crystallin genes, oriented in the same direction. The sequence of these genes and their 5' and 3' flanking regions has been determined. The coding regions of both genes are interrupted by two introns. The first introns (94 and 100 bp, respectively) are located in the 5' region of the genes. The second introns (2.82 and 0.95 kb, respectively) divide the genes into two halves, each encoding a structural domain of the gamma-crystallin protein. The coding regions of the two genes show 80% homology. Due to a mutation in the splice acceptor site of the second intron of the first gene, the coding region of its third exon is 3 bp longer than that of the second gene. In the flanking regions several conserved sequence elements were found, including those elements that are known to be necessary for the correct expression of eukaryotic genes. The flanking and intronic regions of the genes contain 'simple sequence' DNA and Alu repeats. The Alu repeats are usually clustered, contain truncated elements, and are often located near simple sequence DNA.

4.
5.
The gene responsible for cystic fibrosis, the most common severe autosomal recessive disorder, is located on the long arm of human chromosome 7, region q31-q32. The gene has recently been identified and shown to be approximately 250 kb in size. To understand the structure and to provide the basis for a systematic analysis of the disease-causing mutations in the gene, genomic DNA clones spanning different regions of the previously reported cDNA were isolated and used to determine the coding regions and sequences of intron/exon boundaries. A total of 22,708 bp of sequence, accounting for approximately 10% of the entire gene, was obtained. Alignment of the genomic DNA sequence with the cDNA sequence showed perfect colinearity between the two and a total of 27 exons, each flanked by consensus splice signals. A number of repetitive elements, including the Alu and Kpn families and simple repeats, such as (GT)17, (GATT)7, and (TA)14, were detected in close vicinity of some of the intron/exon boundaries. At least three of the simple repeats were found to be polymorphic in the population. Although an internal amino acid sequence homology could be detected between the two halves of the predicted polypeptide, especially in the regions of the two putative nucleotide-binding folds (NBF1 and NBF2), the lack of alignment of the nucleotide sequence as well as the different positions of the exon/intron boundaries does not seem to support the hypothesis of a recent gene duplication event. To facilitate detection of mutations by direct sequence analysis of genomic DNA, 28 sets of oligonucleotide primers were designed and tested for their ability to amplify individual exons and the immediately flanking sequences in the introns.

6.
The human alpha-fetoprotein gene spans 19,489 base pairs from the putative "Cap" site to the polyadenylation site. It is composed of 15 exons separated by 14 introns, which are symmetrically placed within the three domains of alpha-fetoprotein. In the 5' region, a putative TATAAA box is at position -21, and a variant sequence, CCAAC, of the common CAT box is at -65. Enhancer core sequences GTGGTTTAAAG are found in introns 3 and 4, and several copies of glucocorticoid response sequences AGATACAGTA are found on the template strand of the gene. There are six polymorphic sites within 4690 base pairs of contiguous DNA derived from two allelic alpha-fetoprotein genes. This amounts to a measured polymorphic frequency of 0.13%, or 6.4 × 10⁻⁴/site, which is about 5-10 times lower than values estimated from studies on polymorphic restriction sites in other regions of the human genome. There are four types of repetitive sequence elements in the introns and flanking regions of the human alpha-fetoprotein gene. At least one of these is apparently a novel structure (designated Xba) and is found as a pair of direct repeats, with one copy in intron 7 and the other in intron 8. It is conceivable that within the last 2 million years the copy in intron 8 gave rise to the repeat in intron 7. Their present location on both sides of exon 8 gives these sequences a potential for disrupting the functional integrity of the gene in the event of an unequal crossover between them. There are three Alu elements, one of which is in intron 4; the others are located in the 3' flanking region. A solitary Kpn repeat is found in intron 3. The Xba and Kpn repeats were only detected by complete sequencing of the introns. Neither X, Xba, nor Kpn elements are present in the related human albumin gene, whereas Alu's are present in different positions. From phylogenetic evidence, it appears that Alu elements were inserted into the alpha-fetoprotein gene at some time postdating the mammalian radiation 85 million years ago.
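The two polymorphism figures quoted in this abstract can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and dividing by the two compared alleles to obtain the per-site value is an assumption that happens to match the quoted 6.4 × 10⁻⁴, not something the abstract states.

```python
# Figures quoted in the abstract.
POLYMORPHIC_SITES = 6
COMPARED_BP = 4690
# Assumption: the per-site value is averaged over the two alleles compared.
ALLELES_COMPARED = 2


def polymorphism_percent(sites, length_bp):
    """Polymorphic sites as a percentage of the compared length."""
    return 100.0 * sites / length_bp


def per_site_rate(sites, length_bp, alleles):
    """Polymorphic sites per nucleotide position, averaged over alleles."""
    return sites / (length_bp * alleles)


if __name__ == "__main__":
    print(f"{polymorphism_percent(POLYMORPHIC_SITES, COMPARED_BP):.2f} %")
    print(f"{per_site_rate(POLYMORPHIC_SITES, COMPARED_BP, ALLELES_COMPARED):.1e} per site")
```

Under that assumption the percentage rounds to 0.13 and the per-site rate to 6.4e-04, in line with the abstract.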

7.
D Jenne, K K Stanley. Biochemistry, 1987, 26(21): 6735-6742
The S-protein/vitronectin gene was isolated from a human genomic DNA library, and its sequence of about 5.3 kilobases including the adjacent 5' and 3' flanking regions was established. Alignment of the genomic DNA nucleotide sequence and the cDNA sequence indicated that the gene consisted of eight exons and seven introns. The intron positions in the S-protein gene and their phase type were compared to those in the hemopexin gene which shares amino acid sequence homologies with transin and the S-protein. Three introns have been found at equivalent positions; two other introns are very close to these positions and are interpreted as cases of intron sliding. Introns 3-7 occur at a conserved glycine residue within repeating peptide segments, whereas introns 1 and 2 are at the boundaries of the Somatomedin B domain of S-protein. The analysis of the exon structure in relation to repeating peptide motifs within the S-protein strongly suggests that it contains only seven repeats, one less than the hemopexin molecule. A repeat pattern very similar to that in hemopexin is shown to be present also in two other related proteins, transin and interstitial collagenase. An evolutionary model for the generation of the repeat pattern in the S-protein and the other members of this novel "pexin" gene family is proposed, and the sequence modifications for some of the repeats during divergent evolution are discussed in relation to known unique functional properties of hemopexin and S-protein.

8.
The structures of the termini and their flanking regions of two human KpnI family members were investigated. The two differed in length, but the starting sequence at one terminus (defined as the 5' terminus) was found to be common to both members. The Alu family sequence was found in the 5' flanking regions. The KpnI family sequence started several base pairs downstream from the 3' end of the Alu family sequence. In both cases, the Alu family sequence was not flanked by the direct repeat sequence common to the Alu family. These two members showed no sequence homology in their 3' terminal regions. Interestingly, the Alu family plus the KpnI family unit was found to be flanked by a direct repeat sequence several base pairs in length. Based on these findings, the relationship between the Alu family and the KpnI family is discussed.

9.
The sequence of the gorilla alpha-fetoprotein gene, including 869 base pairs of the 5' flanking region and 4892 base pairs of the 3' flanking region (24,607 in total), was determined from two overlapping lambda phage clones. The sequence extends 18,846 base pairs from the Cap site to the polyadenylation site, and it reveals that the gene is composed of 15 exons, which are symmetrically placed within three domains of alpha-fetoprotein. The deduced polypeptide chain is composed of a 19-amino-acid leader peptide, followed by 590 amino acids of the mature protein. The RNA polymerase II binding site, TATAAAA, and the promoter element, CCAAC, are positioned at -21 and -65 from the Cap site, respectively. The polyadenylation signal, AATAAA, is located in the last exon, which is untranslated. The sequence for the gorilla alpha-fetoprotein gene was compared with that of the previously published human alpha-fetoprotein gene (P. E. M. Gibbs, R. Zielinski, C. Boyd, and A. Dugaiczyk, 1987, Biochemistry 26: 1332-1343). Four types of repetitive sequence elements were found in identical positions in both species. However, one Alu and one Xba DNA repeat within introns 4 and 7, respectively, of the human gene are absent from orthologous positions in the gorilla. The Alu and the Xba DNA repeats probably emerged in the human genome after the human/gorilla divergence and became established novelties in the human lineage. There are 363/21,523 mutational changes between human and gorilla, amounting to 1.69% DNA divergence between the two primate species. The value of 1.69% is lower than the 2.27% obtained from melting temperatures of hybrids between human and gorilla genomic DNA (C. G. Sibley and J. E. Ahlquist, 1984, J. Mol. Evol. 26: 99-121). At the protein level, Homo sapiens differs from Gorilla gorilla only at 4 of 609 amino acid positions (0.66%) in the alpha-fetoprotein sequence. This difference signifies a lower rate of molecular divergence for the alpha-fetoprotein gene in primates, as compared to rodents.
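The divergence percentages in this abstract follow directly from the quoted counts. The short sketch below recomputes them; the helper name `percent_difference` is hypothetical, and all numbers are taken from the abstract itself.

```python
def percent_difference(differences, positions):
    """Differences as a percentage of the positions compared."""
    return 100.0 * differences / positions


# Counts quoted in the abstract.
DNA_CHANGES, DNA_POSITIONS = 363, 21523
PROTEIN_CHANGES = 4
LEADER_RESIDUES, MATURE_RESIDUES = 19, 590

# The 609 amino acid positions are the leader peptide plus the mature protein.
protein_positions = LEADER_RESIDUES + MATURE_RESIDUES

dna_divergence = percent_difference(DNA_CHANGES, DNA_POSITIONS)
protein_divergence = percent_difference(PROTEIN_CHANGES, protein_positions)

if __name__ == "__main__":
    print(f"DNA divergence:     {dna_divergence:.2f} %")
    print(f"Protein divergence: {protein_divergence:.2f} %")
```

Both values round to the figures given in the text (1.69% and 0.66%).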

10.
The human alpha-fetoprotein (AFP) gene was isolated into three overlapping clones in bacteriophage lambda vectors and its sequence organization analyzed by restriction endonuclease mapping and nucleotide sequencing. The human AFP gene is about 20 kilobase pairs long and contains 15 exons and 14 introns. The overall organization of the human AFP gene is similar to that of the mouse AFP gene, with all but two exons showing identical sizes. Nucleotide sequences at all exon/intron junctions display similarity to the consensus boundary sequence (Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383), with the GT-AG rule applied to the splicing point. The cap site maps 44 nucleotides upstream from the translation initiation site. The "TATA box" is located 27 nucleotides upstream from the putative cap site and is flanked by sequences with dyad symmetry. The TATA box can thus be placed in the loop portion of a possible stem-loop structure formed by intrastrand base-pairing. Other characteristic nucleotide sequences in the 5' flanking region include a CCAAC pentamer, a 14-base pair (bp) enhancer-like sequence, and a 9-bp sequence homologous to the glucocorticoid responsive element. A long (90 bp) direct repeat and several alternating purine/pyrimidine sequences are also present in the 5' flanking region. A 736-bp sequence of the 5' flanking region adjacent to the cap site of the human AFP gene shows a 61% similarity with the corresponding region of the mouse AFP gene. There are two Alu family sequences and two poly(dT-dG) repeats in the human AFP gene that show different distribution patterns from those in the mouse AFP gene.

11.
Sequence of the cDNA and gene for angiogenin, a human angiogenesis factor   Total citations: 29 (self-citations: 0, citations by others: 29)
Human cDNAs coding for angiogenin, a human tumor derived angiogenesis factor, were isolated from a cDNA library prepared from human liver poly(A) mRNA employing a synthetic oligonucleotide as a hybridization probe. The largest cDNA insert (697 base pairs) contained a short 5'-noncoding sequence followed by a sequence coding for a signal peptide of 24 (or 22) amino acids, 369 nucleotides coding for the mature protein of 123 amino acids, a stop codon, a 3'-noncoding sequence of 175 nucleotides, and a poly(A) tail. The gene coding for human angiogenin was then isolated from a genomic lambda Charon 4A bacteriophage library employing the cDNA as a probe. The nucleotide sequence of the gene and the adjacent 5'- and 3'-flanking regions (4688 base pairs) was then determined. The coding and 3'-noncoding regions of the gene for human angiogenin were found to be free of introns, and the DNA sequence for the gene agreed well with that of the cDNA. The gene contained a potential TATA box in the 5' end in addition to two Alu repetitive sequences immediately flanking the 5' and 3' ends of the gene. A third Alu sequence was also found about 500 nucleotides downstream from the Alu sequence at the 3' end of the gene. The amino acid sequence of human angiogenin as predicted from the gene sequence was in complete agreement with that determined by amino acid sequence analysis. It is about 35% homologous with human pancreatic ribonuclease, and the amino acid residues that are essential for the activity of ribonuclease are also conserved in angiogenin. This provocative finding is thought to have important physiological implications.

12.
The complete sequence of a functionally expressed human beta-tubulin gene (5 beta) is presented. The amino acid sequence encoded by this gene constitutes a distinct isotype, differing from a previously described human beta-tubulin sequence at 21 positions throughout the polypeptide chain. The beta-tubulin coding sequence in 5 beta is interrupted by three intervening sequences of 1014, 117 and 4826 nucleotides. The largest of these contains ten members of the Alu family of middle repetitive sequences. Together, these regions account for sixty percent of this intervening sequence. Two of the Alu elements are juxtaposed head to tail, and share the same flanking direct repeat. The ten Alu sequences are substantially divergent, both from each other and from an Alu consensus sequence, and several contain deletions of up to half the entire sequence.

13.
We have determined the nucleotide sequence of the human plasminogen activator inhibitor-1 (PAI-1) gene and significant stretches of DNA which extend into its 5'- and 3'-flanking DNA regions; a total sequence of 15,867 base pairs (bp) is presented. The sequenced 5'-flanking DNA (1,520 bp) contains the essential eukaryotic cis-type proximal regulatory elements CCAAT and TATAA; the more distal 5'-flanking DNA region, as well as some introns, contain sequence elements which share identities with known eukaryotic enhancer elements. A major finding is the identification of a large region of shared nucleotides (comprising about 520 bp) between the 5'-flanking DNAs of PAI-1 and tissue-type plasminogen activator genes. The length of the PAI-1 5'-untranslated region was found to be 145 bp as determined by nuclease analysis. The remaining PAI-1 structural gene consists of amino acid coding regions (containing a total of 1,206 bp, coding for the 23 amino acids of the signal peptide and 379 amino acids of the mature PAI-1 protein), 8 intron regions (a total of 8,978 bp), and a long 3'-untranslated region of about 1,800 bp which contains several polyadenylation sites. Two types of repetitive DNA elements are located within the PAI-1 structural gene and flanking DNAs: we have found 12 Alu elements and 5 repeats of a long poly (Pur) element. These Alu-Pur elements may represent a subset of the more abundant Alu family of repetitive sequence elements.

14.
Cytosine methylation at CpG dinucleotides is thought to cause more than one-third of all transition mutations responsible for human genetic diseases and cancer. We investigated the methylation status of the CpG dinucleotide at codon 248 in exon 7 of the p53 gene because this codon is a hot spot for inactivating mutations in the germ line and in most human somatic tissues examined. Codon 248 is contained within an HpaII site (CCGG), and the methylation status of this and flanking CpG sites was analyzed by using the methylation-sensitive enzymes CfoI (GCGC) and HpaII. Codon 248 and the CfoI and HpaII sites in the flanking introns were methylated in every tissue and cell line examined, indicating extensive methylation of this region in the p53 gene. Exhaustive treatment of an osteogenic sarcoma cell line, TE85, with the hypomethylating drug 5-aza-2'-deoxycytidine did not demethylate codon 248 or the CfoI sites in intron 6, although considerable global demethylation of the p53 gene was induced. Constructs containing either exon 7 alone or exon 7 and the flanking introns were transfected into TE85 cells to determine whether de novo methylation would occur. The presence of exon 7 alone caused some de novo methylation to occur at codon 248. More extensive de novo methylation of the CfoI sites in intron 6, which contains an Alu sequence, occurred in cells transfected with a vector containing exon 7 and flanking introns. With longer time in culture, there was increased methylation at the CfoI sites, and de novo methylation of codon 248 and its flanking HpaII sites was observed. These de novo-methylated sites were also resistant to 5-aza-2'-deoxycytidine-induced demethylation. The frequent methylation of codon 248 and adjacent Alu sequence may explain the enhanced mutability of this site as a result of the deamination of the 5-methylcytosine.
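The analysis in this abstract depends on locating the HpaII (CCGG) and CfoI (GCGC) recognition sites, each of which contains a CpG. As a rough illustration, the sketch below scans a DNA string for those two recognition sequences and for CpG dinucleotides. The function names and the toy sequence are hypothetical and are not taken from the p53 gene; because both recognition sequences are palindromic, scanning a single strand is sufficient.

```python
import re

# Recognition sequences as given in the abstract.
RECOGNITION_SITES = {"HpaII": "CCGG", "CfoI": "GCGC"}


def find_sites(seq):
    """Return 0-based start positions of each recognition site in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        # A lookahead pattern reports overlapping occurrences as well.
        hits[enzyme] = [m.start() for m in re.finditer(f"(?={site})", seq)]
    return hits


def cpg_positions(seq):
    """Return 0-based positions of the C in every CpG dinucleotide."""
    return [m.start() for m in re.finditer("(?=CG)", seq.upper())]


if __name__ == "__main__":
    toy_sequence = "ATGCGCAACCGGAGGCC"  # hypothetical sequence, for illustration only
    print(find_sites(toy_sequence))
    print(cpg_positions(toy_sequence))
```

A methylation-sensitive digest would then be interpreted site by site: a site that remains uncut despite being present in the sequence is taken to be methylated.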

15.
The DNA sequence of the cob region of the Schizosaccharomyces pombe mitochondrial DNA has been determined. The cytochrome b structural gene is interrupted by an intron of 2526 base-pairs, which has an open reading frame of 2421 base-pairs in phase with the upstream exon. The position of the intron differs from those found in the cob genes of Saccharomyces cerevisiae, Aspergillus nidulans or Neurospora crassa. The Sch. pombe cob intron has the potential of assuming an RNA secondary structure almost identical to that proposed for the first two cox1 introns (group II) in S. cerevisiae and the p1-cox1 intron in Podospora anserina. It has most of the consensus nucleotides in the central core structure described for this group of introns and its comparison with other group II introns allows the identification of an additional conserved nucleotide stretch. A comparison of the predicted protein sequences of group II intronic coding regions reveals three highly conserved blocks showing pairwise amino acid identities of 34 to 53%. These regions comprise over 50% of the coding length of the intron but do not include the 5' region, which has strong secondary structural features. In addition to the potential intron folding, long helical structures involving repetitive sequences can be formed in the flanking cob exon regions. A comparison of the Sch. pombe cytochrome b sequence with those available from other organisms indicates that Sch. pombe is evolutionarily distant from both budding yeasts and filamentous fungi. As was seen for the Sch. pombe cox1 gene (Lang, 1984), the cob exons are translated using the universal genetic code and this distinguishes Sch. pombe mitochondria from all other fungal and animal mitochondrial systems.

16.
The human albumin-alpha-fetoprotein genomic domain contains 13 repetitive DNA elements randomly distributed throughout the symmetrical structures of these genes. These repeated sequences are located at different sites within the two genes. The human albumin gene contains five Alu elements within four of its 14 intervening sequences. Two of these repeats are located in intron 2, and the remaining three are located in introns 7, 8, and 11. The human alpha-fetoprotein gene contains three of these Alu elements, one in intron 4 and the remaining two in the 3'-untranslated region. In addition, the human alpha-fetoprotein gene contains a Kpn repeat and two classes of novel repeats that are absent from the human albumin gene. Six of the Alu elements within the two genes are bound by short direct repeats that harbor five base substitutions in 120 possible positions (60 bp times 2 termini). The absence of Alu repeats from analogous positions in rodents indicates that these repeats invaded the albumin-alpha-fetoprotein domain less than 85 Myr ago (the time of mammalian radiation). Furthermore, considering the conservation of terminal repeats flanking the Alu sequences of the albumin-alpha-fetoprotein domain (0.042 changes per site), we submit that the average time of Alu insertion into this gene family could have been as recently as 15-30 Myr ago.

17.
The complete nucleotide sequence of the human apolipoprotein AII gene together with 911 bases of 5' flanking sequence and 687 bases of 3' flanking sequence have been determined. The mRNA coding region is interrupted by three introns of 169, 293 and 395 bp. The intron-exon structure of the apo AII gene is similar to that of the apo AI, apo CIII and apo E genes: three introns separate 4 coding sequences specifying the 5' untranslated region, pre-peptide, a short N-terminal domain and a C-terminal domain composed of a variable number of lipid-binding amphipathic helices. Intron II carries a 33 bp dG-dT repetitive element adjacent to the 3' splice junction which has the potential to adopt the Z-DNA conformation. The 5' and 3' termini of the mRNA have been identified by primer extension and S1 nuclease mapping. A number of short direct repeats are found in the 5' flanking region and an inverted repeat occurs between the CAAT and TATA boxes. Downstream of the gene is an Alu family repeat containing a polymorphic MspI site, the deletion of which is associated with increased circulating levels of apo AII. Apo AII gene expression was demonstrated in adult human liver and HepG2 cells but not in human small intestine. Of ten Rhesus monkey tissues examined, apo AII mRNA was detected only in liver.

18.
A K Jaiswal. Biochemistry, 1991, 30(44): 10647-10653

19.
The complete nucleotide sequence and exon/intron structure of the rat embryonic skeletal muscle myosin heavy chain (MHC) gene has been determined. This gene comprises 24 × 10³ bases of DNA and is split into 41 exons. The exons encode a 6035 nucleotide (nt) long mRNA consisting of 90 nt of 5' untranslated, 5820 nt of protein coding and 125 nt of 3' untranslated sequence. The rat embryonic MHC polypeptide is encoded by exons 3 to 41 and contains 1939 amino acid residues with a calculated Mr of 223,900. Its amino acid sequence displays the structural features typical for all sarcomeric MHCs, i.e. an amino-terminal "globular" head region and a carboxy-terminal alpha-helical rod portion that shows the characteristics of a coiled coil with a superimposed 28-residue repeat pattern interrupted at only four positions by "skip" residues. The complex structure of the rat embryonic MHC gene and the conservation of intron locations in this and other MHC genes are indicative of a highly split ancestral sarcomeric MHC gene. Introns in the rat embryonic gene interrupt the coding sequence at the boundaries separating the proteolytic subfragments of the head, but not at the head/rod junction or between the 28-residue repeats present within the rod. Therefore, there is little evidence for exon shuffling and intron-dependent evolution by gene duplication as a mechanism for the generation of the ancestral MHC gene. Rather, intron insertion into a previously non-split ancestral MHC rod gene consisting of multiple tandemly arranged 28-residue-encoding repeats, or convergent evolution of an originally non-repetitive ancestral MHC rod gene must account for the observed structure of the rod-encoding portion of present-day MHC genes.
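The mRNA figures in this abstract are internally consistent, which a short check makes explicit. The sketch below uses only the quoted numbers; the variable names are hypothetical, and treating the 5820 nt coding figure as including the stop codon is an assumption made to reconcile it with the 1939 residues.

```python
# Segment lengths quoted in the abstract, in nucleotides.
UTR_5 = 90
CODING = 5820
UTR_3 = 125

mrna_length = UTR_5 + CODING + UTR_3

# Three nucleotides per codon.
codons, remainder = divmod(CODING, 3)

# Assumption: the coding figure includes the stop codon, which adds no residue.
residues = codons - 1

if __name__ == "__main__":
    print(f"mRNA length: {mrna_length} nt")
    print(f"codons: {codons} (remainder {remainder})")
    print(f"amino acid residues: {residues}")
```

Under that assumption the three segments sum to the stated 6035 nt and the coding region yields the stated 1939 residues.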

20.