Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
2.
Interactions at the 3' end of the intron initiate spliceosome assembly and splice site selection in vertebrate pre-mRNAs. Multiple factors, including U1 small nuclear ribonucleoproteins (snRNPs), are involved in initial recognition at the 3' end of the intron. Experiments were designed to test the possibility that U1 snRNP interaction at the 3' end of the intron during early assembly functions to recognize and define the downstream exon and its resident 5' splice site. Splicing precursor RNAs constructed to have elongated second exons lacking 5' splice sites were deficient in spliceosome assembly and splicing activity in vitro. Similar substrates including a 5' splice site at the end of exon 2 assembled and spliced normally as long as the second exon was less than 300 nucleotides long. U2 snRNPs were required for protection of the 5' splice site terminating exon 2, suggesting direct communication during early assembly between factors binding the 3' and 5' splice sites bordering an exon. We suggest that exons are recognized and defined as units during early assembly by binding of factors to the 3' end of the intron, followed by a search for a downstream 5' splice site. In this view, stable exon complexes form only when both a 3' and a 5' splice site are present in the correct orientation and within 300 nucleotides of one another. Concerted recognition of exons may help explain the 300-nucleotide-length maximum of vertebrate internal exons, the mechanism whereby the splicing machinery ignores cryptic sites within introns, the mechanism whereby exon skipping is normally avoided, and the phenotypes of 5' splice site mutations that inhibit splicing of neighboring introns.

3.
Nucleotide sequence of the gene for human factor IX (antihemophilic factor B)   Total citations: 97, self-citations: 0, citations by others: 97
Two different human genomic DNA libraries were screened for the gene for blood coagulation factor IX by employing a cDNA for the human protein as a hybridization probe. Five overlapping lambda phages were identified that contained the gene for factor IX. The complete DNA sequence of about 38 kilobases for the gene and the adjacent 5' and 3' flanking regions was established by the dideoxy chain termination and chemical degradation methods. The gene contained about 33.5 kilobases of DNA, including seven introns and eight exons within the coding and 3' noncoding regions of the gene. The eight exons code for a prepro leader sequence and 415 amino acids that make up the mature protein circulating in plasma. The intervening sequences range in size from 188 to 9473 nucleotides and contain four Alu repetitive sequences, including one in intron A and three in intron F. A fifth Alu repetitive sequence was found immediately flanking the 3' end of the gene. A 50 base pair insert in intron A was found in a clone from one of the genomic libraries but was absent in clones from the other library. Intron A as well as the 3' noncoding region of the gene also contained alternating purine-pyrimidine sequences that provide potential left-handed helical DNA or Z-DNA structures for the gene. KpnI repetitive sequences were identified in intron D and the region flanking the 5' end of the gene. The 5' flanking region also contained a 1.9-kb HindIII subfamily repeat. The seven introns in the gene for factor IX were located in essentially the same position as the seven introns in the gene for human protein C, while the first three were found in positions identical with those in the gene for human prothrombin.

4.
The gene for the human glandular kallikrein, prostate-specific antigen, has been cloned. The sequence of 7130 nucleotides encompassing the gene and 633 bp of 5' and 639 bp of 3' flanking DNA has been determined. The transcription initiation site was slightly heterogeneous, yielding 5' non-translated leader sequences of 41 and 35 bp. The gene is divided into five exons, with introns located at positions identical with those found in other glandular kallikrein genes. The nucleotide sequence is very similar to that of the human kallikrein gene hGK-1, with 76 to 93% of the nucleotides being identical in the exons and 76 to 87% in the introns. The similarity also extends approximately 200 bp into the sequence flanking the 5' end of hGK-1 and several other glandular kallikrein genes, both human and rodent.

5.
Nucleotide sequence of the gene for the b subunit of human factor XIII   Total citations: 9, self-citations: 0, citations by others: 9
R E Bottenus, A Ichinose, E W Davie. Biochemistry, 1990, 29(51): 11195-11209
Factor XIII (Mr 320,000) is a blood coagulation factor that stabilizes and strengthens the fibrin clot. It circulates in blood as a tetramer composed of two a subunits (Mr 75,000 each) and two b subunits (Mr 80,000 each). The b subunit consists of 641 amino acids and includes 10 tandem repeats of 60 amino acids known as GP-I structures, short consensus repeats (SCR), or sushi domains. In the present study, the human gene for the b subunit has been isolated from three different genomic libraries prepared in lambda phage. Fifteen independent phage with inserts coding for the entire gene were isolated and characterized by restriction mapping, Southern blotting, and DNA sequencing. The gene was found to be 28 kilobases in length and consisted of 12 exons (I-XII) separated by 11 intervening sequences. The leader sequence was encoded by exon I, while the carboxyl-terminal region of the protein was encoded by exon XII. Exons II-XI each coded for a single sushi domain, suggesting that the gene evolved through exon shuffling and duplication. The 12 exons in the gene ranged in size from 64 to 222 base pairs, while the introns ranged in size from 87 to 9970 nucleotides and made up 92% of the gene. The introns contained four Alu repetitive sequences, one each in introns A, E, I, and J. A fifth Alu repeat was present in the flanking 3' end of the gene. Two partial KpnI repeats were also found in the introns, including one in intron I and one in intron J. The KpnI repeat in intron J was 89% homologous to a sequence of approximately 2200 nucleotides flanking the gene coding for human beta globin and approximately 3800 nucleotides from the L1 insertion present in the gene for human factor VIII. Intron H also contained an "O" family repeat, while two potential regions for Z-DNA were identified within introns G and J. One nucleotide change was found in the coding region of the gene when its sequence was compared to that of the cDNA. This difference, however, did not result in a change in the amino acid sequence of the protein.

6.
S Y Shai, C Woodley-Miller, J Chao, L Chao. Biochemistry, 1989, 28(13): 5334-5343
Tissue kallikreins are a group of serine proteases which may function as peptide hormone processing enzymes. Two rat kallikrein genomic clones (RSKG-5 and RSKG-50) were sequenced and characterized. The rat tonin gene and a kallikrein-like gene were found in clones RSKG-5 and RSKG-50, respectively. The tonin gene is 4146 base pairs in length, with both the variant CCAAA and TTTAAA boxes in the 5'-end region and an AATAAA polyadenylation signal at the 3' end of the gene. It has five exons which are separated by four introns. Sequence analysis of 3.7 kb of 5' upstream and 7.5 kb of 3' downstream sequence flanking the tonin gene failed to reveal a second kallikrein gene. Sequence comparisons of the RSKG-5 exons with tonin cDNA revealed that only one base in the 3'-noncoding region was different from that in the previously reported rat tonin cDNA. Characteristic TC- and TG-repeated sequences were also found in the first and second introns of the tonin gene. The tonin gene encodes a preprotonin of 259 amino acids (aa). The active enzyme consists of 235 aa and is preceded by a deduced signal peptide of 17 aa and a profragment of 7 aa. Northern blot analysis indicates that RSKG-5 is expressed in a sex-dependent manner in rat submandibular gland, with a higher level expressed in males. The RSKG-50 gene was truncated at an EcoRI site in the second intron, excluding its 5' end. Compared to the coding sequence of pancreatic kallikrein, 12 nucleotides have been deleted in exon 3 of the RSKG-50 gene. The nucleotide sequences of the third, fourth, and fifth exons of the RSKG-50 gene encode a polypeptide of 188 aa residues. The translated peptide is 80% homologous to rat pancreatic kallikrein and 75% homologous to rat tonin in the corresponding regions. Key residues in the RSKG-50 gene product indicate a serine protease with kallikrein-like cleavage specificity at basic amino acids.

7.
We report the structural organization of an 80 kb segment of rat DNA, which encodes about 40% of thyroglobulin mRNA at the 3' end. The coding information included in this segment is split into 17 exons of homogeneous size (about 200 bp). The seven exons at the extreme 3' end have been precisely defined by DNA sequence analysis. No clear sequence homology is found among the exons, even though their coding capacity is quite similar, from 55 to 63 amino acid residues. We located 2 hormonogenic (T4 forming) sites on the extreme 3' end of the gene in different exons. The DNA sequence coding for these functional sites shows 70% homology in a 50-nucleotide segment. In addition, we found a remnant of this sequence in other exons of the gene. Two large introns have been found on the 3' end of the gene: one is 17 kb and the other is more than 30 kb long. On the basis of these findings and of preliminary studies on the remaining 5' end of the gene, we predict that the minimum length of the rat TGB gene will be 150 kb, which makes this gene the largest eukaryotic gene identified so far. We propose in addition that the 3' end exons arose by duplication of a common ancestor.

8.
Sequence of the cDNA and gene for angiogenin, a human angiogenesis factor   Total citations: 29, self-citations: 0, citations by others: 29
Human cDNAs coding for angiogenin, a human tumor derived angiogenesis factor, were isolated from a cDNA library prepared from human liver poly(A) mRNA employing a synthetic oligonucleotide as a hybridization probe. The largest cDNA insert (697 base pairs) contained a short 5'-noncoding sequence followed by a sequence coding for a signal peptide of 24 (or 22) amino acids, 369 nucleotides coding for the mature protein of 123 amino acids, a stop codon, a 3'-noncoding sequence of 175 nucleotides, and a poly(A) tail. The gene coding for human angiogenin was then isolated from a genomic lambda Charon 4A bacteriophage library employing the cDNA as a probe. The nucleotide sequence of the gene and the adjacent 5'- and 3'-flanking regions (4688 base pairs) was then determined. The coding and 3'-noncoding regions of the gene for human angiogenin were found to be free of introns, and the DNA sequence for the gene agreed well with that of the cDNA. The gene contained a potential TATA box in the 5' end in addition to two Alu repetitive sequences immediately flanking the 5' and 3' ends of the gene. A third Alu sequence was also found about 500 nucleotides downstream from the Alu sequence at the 3' end of the gene. The amino acid sequence of human angiogenin as predicted from the gene sequence was in complete agreement with that determined by amino acid sequence analysis. It is about 35% homologous with human pancreatic ribonuclease, and the amino acid residues that are essential for the activity of ribonuclease are also conserved in angiogenin. This provocative finding is thought to have important physiological implications.

9.
10.
11.
12.
The human glucocerebrosidase gene and pseudogene: structure and evolution   Total citations: 36, self-citations: 0, citations by others: 36
We report the sequence of the entire human gene encoding beta-glucocerebrosidase and that of the associated pseudogene. The gene contains 11 exons extending from base pair 355 to base pair 7232 in the overall sequence. The gene promoter contains TATA- and CAT-like boxes upstream of the major 5' end of the glucocerebrosidase RNA. The two TATA boxes lie between nucleotides (-23)-(-27) and (-33)-(-39) and the two possible CAT boxes reside between nucleotides (-90)-(-94) and (-96)-(-99) in relation to the major 5' end of the mRNA. The functionality of the promoter region was monitored by coupling it to the bacterial gene coding for chloramphenicol acetyltransferase (CAT) and assaying the expression of the enzyme in cells transfected with this vector. The glucocerebrosidase promoter not only directs synthesis of the bacterial enzyme but also exhibits the same pattern of tissue-specific expression as that of the endogenous gene. An apparently tightly linked pseudogene is approximately 96% homologous to the functional gene. However, introns 2, 4, 6, and 7 have large "deletions" consisting of Alu sequences 313, 626, 320, and 277 bp in length, respectively. It is entirely possible that the ancestral gene lacks these sequences and that they have been inserted into the introns of the functioning gene. There is also a 55-bp deletion from a part of exon 9 flanked by a short inverted repeat. The sequence data should facilitate development of methods for diagnosis of Gaucher disease at the molecular level.

13.
In virtually all of the 200 group I introns sequenced thus far, the specificity of 5' splice-site cleavage is determined by a basepair between a uracil base at the end of the 5' exon and a guanine in an intron guide sequence which pairs with the nucleotides flanking the splice-site. It has been reported that two introns in the cytochrome oxidase subunit I gene of Aspergillus nidulans and Podospora anserina are exceptions to this rule and have a C.G basepair in this position. We have confirmed the initial reports and shown for one of them that RNA editing does not convert the C to a U. Both introns autocatalytically cleave the 5' splice-site. Mutation of the C to U in one intron reduces the requirement for Mg2+ and leads to an increase in the rate of cleavage. As the C base encodes a highly conserved amino acid, we propose that it is selected post-translationally at the level of protein function, despite its inferior splicing activity.

14.
15.
16.
Nucleotide sequence of the gene for human prothrombin   Total citations: 23, self-citations: 0, citations by others: 23
S J Degen, E W Davie. Biochemistry, 1987, 26(19): 6165-6177
A human genomic DNA library was screened for the gene coding for human prothrombin with a cDNA coding for the human protein. Eighty-one positive lambda phage were identified, and three were chosen for further characterization. These three phage hybridized with 5' and/or 3' probes prepared from the prothrombin cDNA. The complete DNA sequence of 21 kilobases of the human prothrombin gene was determined and included a 4.9-kilobase region that was previously sequenced. The gene for human prothrombin contains 14 exons separated by 13 intervening sequences. The exons range in size from 25 to 315 base pairs, while the introns range from 84 to 9447 base pairs. Ninety percent of the gene is composed of intervening sequence. All the intron splice junctions are consistent with sequences found in other eukaryotic genes, except for the presence of GC rather than GT on the 5' end of intervening sequence L. Thirty copies of Alu repetitive DNA and two copies of partial KpnI repeats were identified in clusters within several of the intervening sequences, and these repeats represent 40% of the DNA sequence of the gene. The size, distribution, and sequence homology of the introns within the gene were then compared to those of the genes for the other vitamin K dependent proteins and several other serine proteases.

17.
18.
A database of 209 Drosophila introns was extracted from Genbank (release number 64.0) and examined by a number of methods in order to characterize features that might serve as signals for messenger RNA splicing. A tight distribution of sizes was observed: while the smallest introns in the database are 51 nucleotides, more than half are less than 80 nucleotides in length, and most of these have lengths in the range of 59-67 nucleotides. Drosophila splice sites found in large and small introns differ in only minor ways from each other and from those found in vertebrate introns. However, larger introns have greater pyrimidine-richness in the region between 11 and 21 nucleotides upstream of 3' splice sites. The Drosophila branchpoint consensus matrix resembles C T A A T (in which branch formation occurs at the underlined A), and differs from the corresponding mammalian signal in the absence of G at the position immediately preceding the branchpoint. The distribution of occurrences of this sequence suggests a minimum distance between 5' splice sites and branchpoints of about 38 nucleotides, and a minimum distance between 3' splice sites and branchpoints of 15 nucleotides. The methods we have used detect no information in exon sequences other than in the few nucleotides immediately adjacent to the splice sites. However, Drosophila resembles many other species in that there is a discontinuity in A + T content between exons and introns, which are A + T rich.

19.
Two cDNA clones containing the complete protein-coding sequence of 1,188 nucleotides as well as the 5' and 3' non-coding regions of human prostatic acid phosphatase (PAP) were isolated and sequenced. The size of PAP mRNAs from benign prostate hyperplasia and cancerous prostate was estimated to be 3.2 kb, indicating that the 3' downstream polyadenylation signal was used. Several genomic clones containing parts of the human PAP gene were isolated and the nucleotide sequence of ten exons and their flanking regions was determined. The protein-coding sequence of the human PAP gene was interrupted by nine introns. The positions of all nine introns present in the human PAP gene were homologous to those of the first nine introns in the human lysosomal acid phosphatase (LAP) gene. However, the last (11th) exon of the LAP gene encoding the COOH-terminal domain, which includes a transmembrane segment, was found to be absent in the human PAP gene. Southern blot analysis of ten mammalian genomic DNAs gave multiple EcoRI fragments. The data for human genomic DNAs were consistent with a total length of the PAP gene of at least 50 kilobases.

20.
A recombinant phage, SpC3, containing a 17 kb genomic DNA insert representing approximately 60% of the 3' portion of the sheep collagen alpha 2 gene, was evaluated by electron microscopic R loop analysis. A minimum of 17 intervening sequences (introns) and 18 alpha 2 coding sequences (exons) were mapped. With the exception of the 850 base pair exon located at the extreme 3' end of the insert, all exons contained 250 base pairs or less. The total length of all the exons in SpC3 was 3,014 base pairs. The lengths of the 17 introns ranged from 300 to 1600 base pairs; together, the introns comprised 14,070 base pairs of SpC3 DNA. Thus, the DNA region required for coding the interspersed 3 kb of alpha 2 collagen genetic information was 5.6-fold longer than the corresponding alpha 2 mRNA coding sequences.
