Similar articles
 Found 20 similar articles (search time: 78 ms)
1.
Two linked genes, A1 and A2, coding for nearly identical isoforms of the acyl carrier protein (ACP) were isolated from an Arabidopsis thaliana (Columbia) genomic library and sequenced. The amino acid sequences deduced from the nucleotide sequences of the two genes indicate that they encode distinct transit peptides, but the mature proteins are the same except for residue 79. Both genes are predicted to contain three introns in similar positions, although the introns differ in sequence and length. The introns interrupt regions coding for a) the transit peptide, b) the junction of the transit peptide and mature protein, and c) the highly conserved domain surrounding serine 38 to which the phosphopantetheine is attached. Primer extension analysis indicates that at least A1 is active in young plants.

2.
Sixty-five TAC (transformation-competent artificial chromosome) clones were selected from a genomic library of Lotus japonicus accession MG-20 based on the sequence information of expressed sequence tags (ESTs), cDNA and gene information, and their nucleotide sequences were determined. The average insert size of the TAC clones was approximately 100 kb, and the total length of the sequenced regions in this study is 6,556,100 bp. Together with the nucleotide sequences of 56 TAC clones previously reported, the regions sequenced so far total 12,029,295 bp. By comparison with the sequences in protein and EST databases and by analysis with computer programs for gene modeling, a total of 711 potential protein-encoding genes with known or predicted functions, 239 gene segments and 90 pseudogenes were identified in the newly sequenced regions. The average gene density assigned so far was 1 gene/9140 bp. The average length of the assigned genes was 2.6 kb, which is considerably larger than that assigned in the Arabidopsis thaliana genome (1.9 kb for 6451 genes). Introns were identified in approximately 73% of the potential genes, and the average number and length of the introns per gene were 3.4 and 377 bp, respectively. Simple sequence repeat length polymorphism (SSLP) or derived cleaved amplified polymorphic sequence (dCAPS) markers were generated based on the nucleotide sequences of the genomic clones obtained, and each clone was mapped onto the linkage map using the F2 mapping population derived from a cross of two accessions of L. japonicus, Gifu B-129 and Miyakojima MG-20. The sequence data, gene information and mapping information are available through the World Wide Web at http://www.kazusa.or.jp/lotus/.

3.
The gene for prosaposin was characterized by sequence analysis of chromosomal DNA to gain insight into the evolution of this locus, which encodes four highly conserved sphingolipid activator proteins or saposins. The 13 exons ranged in size from 57 to 1200 bp, while the introns were from 91 to 3812 bp in length. The regions encoding saposins A, B, and D each had three exons, while that for saposin C had only two. This sequence included the regions that encode the carboxy terminus of the signal peptide, the four mature prosaposin proteins, and the 3' untranslated region. Primer extension studies indicated that over 99% of the coding sequence was contained in these 19,985 bp. Use of PCR and reverse PCR techniques indicated that the 5'-most approximately 140 bp of coding sequence contained large introns and at least two small exons. Analyses of the intronic positions in the saposin regions indicated that this gene evolved from an ancestral gene by two duplication events and at least one gene rearrangement involving a double crossover after introns had been inserted into the gene.

4.
In our ongoing project to deduce the nucleotide sequence of Arabidopsis thaliana chromosome 5, non-redundant P1 and TAC clones have been sequenced on the basis of the fine physical map, and as of January 2000, sequences totaling 16.6 Mb, representing approximately 60% of chromosome 5, have been accumulated and released at our web site. Along with the sequence determination, structural features of the sequenced regions have been analyzed by applying a variety of computer programs, and we have already predicted a total of 2697 potential protein-coding genes in the 11,166,130 bp regions, which are covered by 159 P1 and TAC clones. In this paper, we describe the structural features of the 3,076,755 bp regions covered by 60 newly analyzed P1 and TAC clones. A total of 715 potential protein-coding genes were identified, giving an average density of the genes identified of 1 gene per 4001 bp. Introns were observed in 80% of the genes, and the average number per gene and the average length of the introns were 4.5 and 147 bp, respectively. These sequence features are nearly identical to those in our latest report, in which the data were compiled based on a new standard of gene assignment including the computer-predicted hypothetical genes. The regions also contained 12 tRNA genes when searched by similarity to reported tRNA genes and the tRNAscan-SE program. The sequence data and information on the potential genes are available through the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/kaos/.

5.
The complete human dihydrofolate reductase (DHFR) gene has been cloned from four recombinant lambda libraries constructed with the DNA from a methotrexate-resistant human cell line with amplified DHFR genes. The detailed organization of the gene has been determined by restriction mapping of the cloned fragments and DNA sequencing of all the protein coding regions and adjacent intron segments, and shown to correspond to that of the native human DHFR gene. The gene spans a length of approximately 29 × 10^3 bases from the ATG initiator codon to the end of the 3' untranslated region, and contains five introns that interrupt the protein coding sequence. The number and positions of introns are identical to those found in the mouse gene. By contrast, the size of the homologous introns (with the exception of the first one) varies greatly, up to several fold, in the genes from man, mouse and Chinese hamster; the intron sequences also exhibit a great divergence, except in the junction regions. A striking sequence homology, extending over several hundred nucleotides, exists between the human and mouse gene 5' non-coding regions. These regions are characterized by an unusually high G + C content, 72% and 66% in the human and mouse genes, respectively, which is maintained in the first coding segment and first intron, and is in sharp contrast to the relatively low G + C content (approximately 40%) of the remainder of the gene.
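
The G + C percentages quoted in this abstract are base-composition figures: the share of G and C among the bases of a region. The abstract does not say how they were computed, so the following is only a minimal sketch. The function names `gc_content` and `windowed_gc` and the toy sequence are hypothetical, and ignoring ambiguous bases such as N is an assumption.

```python
def gc_content(seq: str) -> float:
    """Return the G + C percentage of a DNA sequence.

    Assumption: only unambiguous bases (A, C, G, T) are counted;
    any other symbol, such as N, is ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (start, G + C percent) for each full window along the sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence (not DHFR data): a GC-rich stretch
# followed by an AT-rich stretch.
toy = "GCGGCCGCGG" + "ATTATAATGC"
profile = windowed_gc(toy, window=10, step=10)
```

Run over a real sequence with a larger window, a profile like this would let a GC-rich 5' region be compared with the rest of a gene, as the abstract does.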

6.
7.
8.
9.
10.
Nucleotide sequence of the gene for the b subunit of human factor XIII   Cited by: 9 (self-citations: 0, citations by others: 9)
R E Bottenus, A Ichinose, E W Davie. Biochemistry, 1990, 29(51): 11195-11209
Factor XIII (Mr 320,000) is a blood coagulation factor that stabilizes and strengthens the fibrin clot. It circulates in blood as a tetramer composed of two a subunits (Mr 75,000 each) and two b subunits (Mr 80,000 each). The b subunit consists of 641 amino acids and includes 10 tandem repeats of 60 amino acids known as GP-I structures, short consensus repeats (SCR), or sushi domains. In the present study, the human gene for the b subunit has been isolated from three different genomic libraries prepared in lambda phage. Fifteen independent phage with inserts coding for the entire gene were isolated and characterized by restriction mapping, Southern blotting, and DNA sequencing. The gene was found to be 28 kilobases in length and consisted of 12 exons (I-XII) separated by 11 intervening sequences. The leader sequence was encoded by exon I, while the carboxyl-terminal region of the protein was encoded by exon XII. Exons II-XI each coded for a single sushi domain, suggesting that the gene evolved through exon shuffling and duplication. The 12 exons in the gene ranged in size from 64 to 222 base pairs, while the introns ranged in size from 87 to 9970 nucleotides and made up 92% of the gene. The introns contained four Alu repetitive sequences, one each in introns A, E, I, and J. A fifth Alu repeat was present in the flanking 3' end of the gene. Two partial KpnI repeats were also found in the introns, including one in intron I and one in intron J. The KpnI repeat in intron J was 89% homologous to a sequence of approximately 2200 nucleotides flanking the gene coding for human beta globin and approximately 3800 nucleotides from the L1 insertion present in the gene for human factor VIII. Intron H also contained an "O" family repeat, while two potential regions for Z-DNA were identified within introns G and J. One nucleotide change was found in the coding region of the gene when its sequence was compared to that of the cDNA. This difference, however, did not result in a change in the amino acid sequence of the protein.

11.
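In this study, we first cloned from peach the second intron of PpMADS4, a homolog of AGAMOUS (AG); the intron, designated pPpMADS4, is approximately 2.1 kb in full length. Sequence analysis showed that the intron contains several regulatory elements that are important for gene expression. We also cloned the second intron of the PpMADS4 gene from seven different peach cultivars. Sequence alignment and SNP estimation showed that the PpMADS4 second intron is an SNP-rich region with a high degree of nucleotide polymorphism, yet the sequences and positions of the regulatory elements within it are highly conserved, suggesting that these elements may have important biological functions. To understand the regulatory function of this intron, pPpMADS4 was ligated to the minimal 35S promoter and fused to the GUS gene, and the resulting expression vector was transformed into wild-type Arabidopsis. GUS staining showed that expression was mainly distributed in the two whorls of reproductive organs of the flower, a pattern similar to, but not identical with, the GUS staining directed by the AG second intron in Arabidopsis. The PpMADS4 second intron can specifically drive GUS expression at late stages of flower development.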

12.
The rat cytochrome P-450d gene, whose expression is induced by the administration of 3-methylcholanthrene (MC), has been cloned and its complete nucleotide sequence determined. The gene is 6.9 kilobases long and is separated into 7 exons by 6 introns. The insertion sites of the introns in this gene are well conserved as compared with those of another MC-inducible cytochrome P-450c gene, but are completely different from those of a phenobarbital-inducible cytochrome P-450e gene. The overall homologies in the coding nucleotide and deduced amino acid sequences were 75% and 68% between the two MC-inducible cytochrome P-450 genes, respectively. The similarity of the gene organization between cytochrome P-450d and P-450c, as well as their homology in the deduced amino acid and the nucleotide sequences, suggests that these two genes of MC-inducible cytochromes P-450 constitute a different subfamily from those of the phenobarbital-inducible one in the cytochrome P-450 gene family. In contrast with the notable sequence homology in the coding region of the two MC-inducible cytochromes P-450, all the introns and the 5'- and 3'-flanking regions of the two genes showed virtually no sequence homology between them except for several short DNA segments that are located in the promoter region and the first intron. The nucleotide sequences and the locations of these conserved short DNA segments in the two genes suggest that they may affect the expression of the genes. Middle repetitive sequences reported as ID or identifier sequences were found in and in the vicinity of the cytochrome P-450d gene.

13.
To deduce the entire sequence of the top arm of Arabidopsis thaliana chromosome 3, sequence determination was performed on a total of 90 P1, TAC and BAC clones chosen according to our sequencing strategy. Sequence features of the resulting 4,251,695 bp regions were analyzed with various computer programs for similarity search and gene modeling. As a result, a total of 941 potential protein-coding genes were identified. The average density of the genes identified was 1 gene per 4210 bp. Introns were observed in 73% of the genes, and the average number per gene and the average length of the introns were 3.6 and 159 bp, respectively. These sequence features are essentially identical to those of chromosomes 3 and 5 in our previous reports. The regions also contained 14 tRNA genes when searched by similarity to reported tRNA genes and the tRNAscan-SE program. The sequence data and information on the potential genes are available through the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/kaos/.

14.
As determined by computer sequence analysis, the average exon length in Arabidopsis thaliana, Oryza sativa, Caenorhabditis elegans, and Homo sapiens genes decreases with an increasing number of introns. In A. thaliana and O. sativa, variations in intron and exon lengths with an increasing number of introns are highly correlated. Linear correlation is observed between the total exon length and the number of introns, while the gene length increases in proportion to the number of introns. In humans, the average intron and gene lengths depend on the gene density in DNA.

15.
A total of 56 TAC clones with an average insert size of 100 kb were isolated from a TAC library of the Lotus japonicus genome based on expressed sequence tags (ESTs), cDNA and gene information, and their nucleotide sequences were determined according to the shotgun-based strategy. The total length of the sequenced regions is 5,473,195 bp. By comparison with the sequences in protein and EST databases and analysis with computer programs for gene modeling, a total of 605 potential protein-encoding genes with known or predicted functions, 69 gene segments, and 172 pseudogenes were identified. The average density of the genes assigned so far is 1 gene/8120 bp. Introns were identified in approximately 78% of the potential genes. There was an average of 3.8 introns per gene and the average length of the introns was 375 bp. DNA markers were generated based on the nucleotide sequences obtained, and each clone was mapped onto the linkage map using the F2 mapping population derived from a cross of L. japonicus Gifu B-129 and Miyakojima MG-20. The sequence data, gene information and mapping information are available through the World Wide Web at http://www.kazusa.or.jp/lotus/.
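
The gene density in this abstract is a simple ratio of sequenced length to gene count, but the abstract does not say which features were counted as genes. The sketch below uses only the figures quoted above and tries two denominators. Counting gene segments together with potential genes is an assumption, not something the abstract states, and `bp_per_gene` is a hypothetical helper name.

```python
# Figures quoted in the abstract.
SEQUENCED_BP = 5_473_195
POTENTIAL_GENES = 605
GENE_SEGMENTS = 69


def bp_per_gene(total_bp: int, gene_count: int) -> float:
    """Average number of base pairs per gene (the "1 gene / N bp" figure)."""
    if gene_count <= 0:
        raise ValueError("gene_count must be positive")
    return total_bp / gene_count


# Denominator 1: potential protein-encoding genes only.
genes_only = bp_per_gene(SEQUENCED_BP, POTENTIAL_GENES)

# Denominator 2 (assumption): potential genes plus gene segments.
genes_and_segments = bp_per_gene(SEQUENCED_BP, POTENTIAL_GENES + GENE_SEGMENTS)

print(f"genes only:         1 gene / {genes_only:.0f} bp")
print(f"genes and segments: 1 gene / {genes_and_segments:.0f} bp")
```

Under these assumptions, the second denominator (674 features) gives about 8120 bp per gene, which matches the quoted density, while the genes-only count gives roughly 9047 bp.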

16.
17.
Organization of the human protein S genes   Cited by: 6 (self-citations: 0, citations by others: 6)
Human genomic clones that span the entire protein S expressed gene (PS alpha) and the 3' two-thirds of the protein S pseudogene (PS beta) have been isolated and characterized. The PS alpha gene is greater than 80 kilobases in length and contains 14 introns and 15 exons, as well as 6 repetitive "Alu" sequences. Exons I and XV contain 112 and 1139 bp 5' and 3' noncoding segments in addition to the amino and carboxyl termini, respectively. Exons I-VIII encode protein segments that are homologous to the vitamin K dependent clotting proteins and are bounded by introns whose position and type are identical with other members of this protein family. Exons IX-XV encode protein segments homologous to sex hormone binding globulin (SHBG) and are bounded by introns of identical type and position as in the SHBG gene. Genomic clones for the PS beta gene cover a distance of greater than 55 kilobases and contain segments corresponding to amino acids 46-635 of the mature protein and the 1.1-kb 3' noncoding region of the cDNA. The presence of multiple base changes in the coding portions of this gene, resulting in termination codons and frame shifts, suggests that it is a pseudogene. Comparison of DNA sequences for the two genes reveals 97% identity for coding and 3' noncoding, and 95.4% for intronic regions, suggesting divergence of the two genes is a relatively recent event.

18.
The nucleotide sequences of the entire Rubisco small subunit (rbcS) multigene family, comprising six genes, in Mesembryanthemum crystallinum (common ice plant) were determined. Five of the genes are arranged in a tandem array spanning 20 kb, while the sixth gene is not closely linked to this array. The mature small subunit coding regions are highly conserved and encode four distinct polypeptides of equal lengths with up to five amino acid differences distinguishing individual genes. The transit peptide coding regions are more divergent in both amino acid sequence and length, encoding five distinct peptide sequences that range from 55 to 61 amino acids in length. Each of the genes has two introns located at conserved sites within the mature peptide-coding regions. The first introns are diverse in sequence and length, ranging from 122 bp to 1092 bp. Five of the six second introns are highly conserved in sequence and length. Two genes, rbcS-4 and rbcS-5, are identical at the nucleotide level starting from 121 bp upstream of the ATG initiation codon to 9 bp downstream of the stop codon, including the sequences of both introns, indicating recent gene duplication and/or gene conversion. Functionally important regulatory elements identified in rbcS promoters of other species are absent from the upstream regions of all but one of the ice plant rbcS genes. Relative expression levels were determined for the rbcS genes and indicate that they are differentially expressed in leaves.

19.
Two human gamma-crystallin genes are linked and riddled with Alu-repeats   Cited by: 7 (self-citations: 0, citations by others: 7)
A human genomic cosmid clone, pHcos gamma-1, has been isolated containing two closely linked gamma-crystallin genes, oriented in the same direction. The sequence of these genes and their 5' and 3' flanking regions has been determined. The coding regions of both genes are interrupted by two introns. The first introns (94 and 100 bp, respectively) are located in the 5' region of the genes. The second introns (2.82 and 0.95 kb, respectively) divide the genes into two halves, each encoding a structural domain of the gamma-crystallin protein. The coding regions of the two genes show 80% homology. Due to a mutation in the splice acceptor site of the second intron of the first gene, the coding region of its third exon is 3 bp longer than that of the second gene. In the flanking regions several conserved sequence elements were found, including those elements that are known to be necessary for the correct expression of eukaryotic genes. The flanking and intronic regions of the genes contain 'simple sequence' DNA and Alu repeats. The Alu repeats are usually clustered, contain truncated elements, and are often located near simple sequence DNA.

20.
The nucleotide sequences of the entire Rubisco small subunit (rbcS) multigene family, comprising six genes, in Mesembryanthemum crystallinum (common ice plant) were determined. Five of the genes are arranged in a tandem array spanning 20 kb, while the sixth gene is not closely linked to this array. The mature small subunit coding regions are highly conserved and encode four distinct polypeptides of equal lengths with up to five amino acid differences distinguishing individual genes. The transit peptide coding regions are more divergent in both amino acid sequence and length, encoding five distinct peptide sequences that range from 55 to 61 amino acids in length. Each of the genes has two introns located at conserved sites within the mature peptide-coding regions. The first introns are diverse in sequence and length, ranging from 122 bp to 1092 bp. Five of the six second introns are highly conserved in sequence and length. Two genes, rbcS-4 and rbcS-5, are identical at the nucleotide level starting from 121 bp upstream of the ATG initiation codon to 9 bp downstream of the stop codon, including the sequences of both introns, indicating recent gene duplication and/or gene conversion. Functionally important regulatory elements identified in rbcS promoters of other species are absent from the upstream regions of all but one of the ice plant rbcS genes. Relative expression levels were determined for the rbcS genes and indicate that they are differentially expressed in leaves.
