Similar literature
 20 similar documents found (search time: 31 ms)
1.
2.
3.
4.
Sixteen P1 and TAC clones assigned to Arabidopsis thaliana chromosome 5 were sequenced, and their sequence features were analyzed using various computer programs. The total length of the sequences determined was 1,013,767 bp. Together with the nucleotide sequences of 109 clones previously reported, the regions of chromosome 5 sequenced so far now total 9,072,622 bp, which presumably covers approximately one-third of the chromosome. A similarity search against the reported gene sequences predicted the presence of a total of 225 protein-coding genes and/or gene segments in the newly sequenced regions, indicating an average gene density of one gene per 4.5 kb. Introns were identified in 72.4% of the potential protein genes for which the entire gene structure was predicted, and the average number per gene and the average length of the introns were 3.3 and 163 bp, respectively. These sequence features are essentially identical to those in the previously reported sequences. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.

5.
6.
Nineteen P1 and TAC clones, which have been mapped on the fine physical map of Arabidopsis thaliana chromosome 5, were sequenced according to the shotgun-based strategy, and their structural features were analysed. The total length of the regions sequenced in this study was 1,367,185 bp. Combining this with the regions covered by 90 P1 and TAC clones previously reported, the total length of chromosome 5 sequenced to date becomes 8,058,855 bp. On the basis of similarity search against protein and EST databases and gene modeling with computer programs, a total of 330 potential protein-coding regions were identified, bringing the average gene density to approximately one gene per 4.1 kb. Introns were identified in 81.0% of the potential protein genes for which the entire gene structure was predicted, with an average number per gene of 4.2 and an average intron length of 180 bp. The RNA-coding genes identified were 9 tRNA genes corresponding to 8 amino acid species and 2 genes for U2 nuclear RNA. These sequence features are essentially identical to those in the previously reported sequences. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.

7.
A total of 17 P1 and TAC clones, each representing an assigned region of chromosome 5, were isolated from P1 and TAC genomic libraries of Arabidopsis thaliana Columbia, and their nucleotide sequences were determined. The total length of the clones sequenced in this study was 1,081,958 bp. As we have previously reported the sequence of 9,072,622 bp by analysis of 125 P1 and TAC clones, the total length of the sequences of chromosome 5 determined so far is now 10,154,580 bp. The sequences were subjected to similarity search against protein and EST databases and analysis with computer programs for gene modeling. As a consequence, a total of 253 potential protein-coding genes with known or predicted functions were identified. The positions of exons which do not show apparent similarity to known genes were also assigned using computer programs for exon prediction. The average density of the genes identified in this study was 1 gene per 4277 bp. Introns were observed in 74% of the potential protein genes, and the average number per gene and the average length of the introns were 4.3 and 168 bp, respectively. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
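
The cumulative length and gene density quoted in the abstract above can be cross-checked with simple arithmetic. The following is a minimal sketch, not the authors' method: the function and constant names are hypothetical, and it assumes the density was computed from the 253 genes with known or predicted functions over the 1,081,958 bp sequenced in that study.

```python
# Figures quoted in the abstract above (entry 7).
PREVIOUS_BP = 9_072_622   # chromosome 5 sequence reported earlier
NEW_BP = 1_081_958        # sequence determined in this study
NEW_GENES = 253           # potential protein-coding genes in the new regions


def cumulative_length(previous_bp: int, new_bp: int) -> int:
    """Total sequenced length after adding the newly sequenced regions."""
    return previous_bp + new_bp


def bp_per_gene(region_bp: int, gene_count: int) -> float:
    """Average number of base pairs per gene (assumed definition of density)."""
    return region_bp / gene_count


total = cumulative_length(PREVIOUS_BP, NEW_BP)
density = bp_per_gene(NEW_BP, NEW_GENES)

print(total)           # compare with the reported 10,154,580 bp
print(round(density))  # compare with the reported 1 gene per 4277 bp
```

Under these assumptions both values agree with the abstract, which suggests the density was derived from the function-assigned genes only.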

8.
A total of 56 TAC clones with an average insert size of 100 kb were isolated from a TAC library of the Lotus japonicus genome based on expressed sequence tag (EST), cDNA and gene information, and their nucleotide sequences were determined according to the shotgun-based strategy. The total length of the sequenced regions is 5,473,195 bp. By comparison with the sequences in protein and EST databases and analysis with computer programs for gene modeling, a total of 605 potential protein-encoding genes with known or predicted functions, 69 gene segments, and 172 pseudogenes were identified. The average density of the genes assigned so far is 1 gene/8120 bp. Introns were identified in approximately 78% of the potential genes. There was an average of 3.8 introns per gene and the average length of the introns was 375 bp. DNA markers were generated based on the nucleotide sequences obtained, and each clone was mapped onto the linkage map using the F2 mapping population derived from a cross of L. japonicus Gifu B-129 and Miyakojima MG-20. The sequence data, gene information and mapping information are available through the World Wide Web at http://www.kazusa.or.jp/lotus/.

9.
In this series of projects sequencing the entire genome of Arabidopsis thaliana chromosome 5, non-redundant P1 and TAC clones have been sequenced according to the fine physical map, and as of May 7, 1999, the sequences of 16.2 Mb representing approximately 60% of chromosome 5 have been accumulated and released at our web site. In parallel, structural features of the sequenced regions have been analyzed by applying a variety of computer programs, and to date we have predicted a total of 2380 potential protein-coding genes in the 10,154,580 bp regions, which are covered by 142 P1 and TAC clones. In this paper, we newly analyzed the structural features of the 1,011,550 bp regions covered by an additional 17 P1 and TAC clones, and predicted 298 protein-coding genes. The average density of the genes identified was 1 gene per 3394 bp. Introns were observed in 67% of the genes, and the average number per gene and the average length of the introns were 3.2 and 159 bp, respectively. The gene density became higher than the value estimated in the previously analyzed regions (1 gene per 4,267 bp), as the data in this paper were compiled based on a new standard of gene assignment including the computer-predicted hypothetical genes. The regions also contained 8 tRNA genes when searched by similarity to reported tRNA genes and the tRNAscan-SE program. The sequence data and information on the potential genes are available on the database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.

10.
In our ongoing project to deduce the nucleotide sequence of Arabidopsis thaliana chromosome 5, non-redundant P1 and TAC clones have been sequenced on the basis of the fine physical map, and as of January 2000, the sequences of 16.6 Mb representing approximately 60% of chromosome 5 have been accumulated and released at our web site. Along with the sequence determination, structural features of the sequenced regions have been analyzed by applying a variety of computer programs, and we have already predicted a total of 2697 potential protein-coding genes in the 11,166,130 bp regions, which are covered by 159 P1 and TAC clones. In this paper, we describe the structural features of the 3,076,755 bp regions covered by 60 newly analyzed P1 and TAC clones. A total of 715 potential protein-coding genes were identified, giving an average gene density of 1 gene per 4001 bp. Introns were observed in 80% of the genes, and the average number per gene and the average length of the introns were 4.5 and 147 bp, respectively. These sequence features are nearly identical to those in our latest report in which the data were compiled based on a new standard of gene assignment including the computer-predicted hypothetical genes. The regions also contained 12 tRNA genes when searched by similarity to reported tRNA genes and the tRNAscan-SE program. The sequence data and information on the potential genes are available through the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/kaos/.

11.
Sixty-five TAC (transformation-competent artificial chromosome) clones were selected from a genomic library of Lotus japonicus accession MG-20 based on the sequence information of expressed sequence tags (ESTs), cDNA and gene information, and their nucleotide sequences were determined. The average insert size of the TAC clones was approximately 100 kb, and the total length of the sequenced regions in this study is 6,556,100 bp. Together with the nucleotide sequences of 56 TAC clones previously reported, the regions sequenced so far total 12,029,295 bp. By comparison with the sequences in protein and EST databases and by analysis with computer programs for gene modeling, a total of 711 potential protein-encoding genes with known or predicted functions, 239 gene segments and 90 pseudogenes were identified in the newly sequenced regions. The average gene density assigned so far was 1 gene/9140 bp. The average length of the assigned genes was 2.6 kb, which is considerably larger than that assigned in the Arabidopsis thaliana genome (1.9 kb for 6451 genes). Introns were identified in approximately 73% of the potential genes, and the average number and length of the introns per gene were 3.4 and 377 bp, respectively. Simple sequence repeat length polymorphism (SSLP) or derived cleaved amplified polymorphic sequence (dCAPS) markers were generated based on the nucleotide sequences of the genomic clones obtained, and each clone was mapped onto the linkage map using the F2 mapping population derived from a cross of two accessions of L. japonicus, Gifu B-129 and Miyakojima MG-20. The sequence data, gene information and mapping information are available through the World Wide Web at http://www.kazusa.or.jp/lotus/.

12.
A total of sixty-two clones were selected from a TAC (transformation-competent artificial chromosome) genomic library of the Lotus japonicus accession MG-20 based on the sequence information of expressed sequence tags (ESTs), cDNA and gene information, and their nucleotide sequences were determined. The length of the sequenced regions in this study is 6,682,189 bp, and the total length of the regions sequenced so far is 18,711,484 bp together with the nucleotide sequences of 121 TAC clones previously reported. By comparison with the sequences in protein and EST databases and analysis with computer programs for gene modeling, a total of 573 potential protein-coding genes with known or predicted functions, 91 gene segments and 272 pseudogenes were identified in the newly sequenced regions. Each of the sequenced clones was localized onto the linkage map of two accessions of L. japonicus, Gifu B-129 and Miyakojima MG-20, using simple sequence repeat length polymorphism (SSLP) or derived cleaved amplified polymorphic sequence (dCAPS) markers generated based on the nucleotide sequences of the clones. The sequence data, gene information and mapping information are available through the World Wide Web at http://www.kazusa.or.jp/lotus/.

13.
Based on the physical map of Arabidopsis thaliana chromosome 3 previously constructed with CIC YAC, TAC, P1 and BAC clones (Sato, S. et al., DNA Res., 5, 163-168, 1998), a total of 60 P1 and TAC clones were sequenced, and the sequence features of the resulting 4,504,864 bp regions were analyzed by applying various computer programs for similarity search and gene modeling. As a result, a total of 1054 potential protein-coding genes were identified. The average density of the genes identified was 1 gene per 4066 bp. Introns were observed in 77% of the genes, and the average number per gene and the average length of the introns were 3.9 and 156 bp, respectively. These sequence features are essentially identical to those of chromosome 5 in our previous reports, but the gene density was slightly higher than that observed for chromosomes 2 and 4. The regions also contained 10 tRNA genes when searched by similarity to reported tRNA genes and the tRNAscan-SE program. The sequence data and information on the potential genes are available through the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/kaos/.

14.
To deduce the entire sequence of the top arm of Arabidopsis thaliana chromosome 3, sequence determination was performed on a total of 90 P1, TAC and BAC clones chosen according to our sequencing strategy. Sequence features of the resulting 4,251,695 bp regions were analyzed with various computer programs for similarity search and gene modeling. As a result, a total of 941 potential protein-coding genes were identified. The average density of the genes identified was 1 gene per 4210 bp. Introns were observed in 73% of the genes, and the average number per gene and the average length of the introns were 3.6 and 159 bp, respectively. These sequence features are essentially identical to those of chromosomes 3 and 5 in our previous reports. The regions also contained 14 tRNA genes when searched by similarity to reported tRNA genes and the tRNAscan-SE program. The sequence data and information on the potential genes are available through the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/kaos/.

15.
We determined the nucleotide sequences of 64 TAC (transformation-competent artificial chromosome) clones selected from genomic libraries of Lotus japonicus accession Miyakojima MG-20 based on the sequence information of expressed sequence tags (ESTs), cDNAs, genes and DNA markers from L. japonicus and other legumes. The length of the DNA regions sequenced in this study was 6,370,255 bp, and the total length of the L. japonicus genome sequenced so far is 32,537,698 bp together with the nucleotide sequences of 256 TAC clones previously reported. Five hundred forty-eight potential protein-encoding genes with known or predicted functions, 127 gene segments and 224 pseudogenes were assigned to the newly sequenced regions by computer prediction and similarity searches against the sequences in protein and EST databases. Based on the nucleotide sequences of the clones, simple sequence repeat length polymorphism (SSLP) or derived cleaved amplified polymorphic sequence (dCAPS) markers were generated, and each clone was genetically localized onto the linkage map of two accessions of L. japonicus, MG-20 and Gifu B-129. The sequence data, gene information and mapping information are available through the World Wide Web at http://www.kazusa.or.jp/lotus/.

16.
Using the sequence information of expressed sequence tags (ESTs), cDNAs and genes from Lotus japonicus and other legumes, 73 TAC (transformation-competent artificial chromosome) clones were selected from a genomic library of L. japonicus accession MG-20, and their nucleotide sequences were determined. The length of the DNA sequenced in this study was 7,455,959 bp, and the total length of the DNA regions sequenced so far is 26,167,443 bp together with the nucleotide sequences of 183 TAC clones previously reported. By similarity searches against the sequences in protein and EST databases and prediction by computer programs, a total of 699 potential protein-encoding genes with known or predicted functions, 163 gene segments and 267 pseudogenes were assigned to the newly sequenced regions. Based on the nucleotide sequences of the clones, simple sequence repeat length polymorphism (SSLP) or derived cleaved amplified polymorphic sequence (dCAPS) markers were generated, and each clone was localized onto the linkage map of two accessions of L. japonicus, Gifu B-129 and Miyakojima MG-20. The sequence data, gene information and mapping information are available through the World Wide Web at http://www.kazusa.or.jp/lotus/.

17.
The nucleotide sequence of the complete genome of a cyanobacterium, Microcystis aeruginosa NIES-843, was determined. The genome of M. aeruginosa is a single, circular chromosome of 5,842,795 base pairs (bp) in length, with an average GC content of 42.3%. The chromosome comprises 6312 putative protein-encoding genes, two sets of rRNA genes, 42 tRNA genes representing 41 tRNA species, and genes for tmRNA, the B subunit of RNase P, SRP RNA, and 6Sa RNA. Forty-five percent of the putative protein-encoding sequences showed sequence similarity to genes of known function, 32% were similar to hypothetical genes, and the remaining 23% had no apparent similarity to reported genes. A total of 688 kb of the genome, equivalent to 11.8% of the entire genome, was composed of both insertion sequences and miniature inverted-repeat transposable elements. This is indicative of a plasticity of the M. aeruginosa genome, through a mechanism that involves homologous recombination mediated by repetitive DNA elements. In addition to known gene clusters related to the synthesis of microcystin and cyanopeptolin, novel gene clusters that may be involved in the synthesis and modification of toxic small polypeptides were identified. Compared with other cyanobacteria, a relatively small number of genes for two-component systems and a large number of genes for restriction-modification systems were notable characteristics of the M. aeruginosa genome.
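
The repeat-element fraction quoted above follows directly from the two lengths given in the abstract. A minimal sketch of that check, assuming 1 kb = 1000 bp (the variable names are illustrative only):

```python
# Figures quoted in the abstract above (entry 17).
GENOME_BP = 5_842_795   # M. aeruginosa NIES-843 chromosome length
REPEAT_KB = 688         # insertion sequences plus MITEs, in kb

# Assumption: 1 kb is taken as 1000 bp.
repeat_bp = REPEAT_KB * 1000
repeat_percent = 100 * repeat_bp / GENOME_BP

print(f"{repeat_percent:.1f}%")  # compare with the reported 11.8%
```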

18.
The nucleotide sequence of the entire genome of a filamentous cyanobacterium, Anabaena sp. strain PCC 7120, was determined. The genome of Anabaena consisted of a single chromosome (6,413,771 bp) and six plasmids, designated pCC7120alpha (408,101 bp), pCC7120beta (186,614 bp), pCC7120gamma (101,965 bp), pCC7120delta (55,414 bp), pCC7120epsilon (40,340 bp), and pCC7120zeta (5,584 bp). The chromosome bears 5368 potential protein-encoding genes, four sets of rRNA genes, 48 tRNA genes representing 42 tRNA species, and 4 genes for small structural RNAs. The predicted products of 45% of the potential protein-encoding genes showed sequence similarity to known and predicted proteins of known function, and 27% to translated products of hypothetical genes. The remaining 28% lacked significant similarity to genes for known and predicted proteins in the public DNA databases. More than 60 genes involved in various processes of heterocyst formation and nitrogen fixation were assigned to the chromosome based on their similarity to the reported genes. One hundred and ninety-five genes coding for components of two-component signal transduction systems, nearly 2.5 times as many as those in Synechocystis sp. PCC 6803, were identified on the chromosome. Only 37% of the Anabaena genes showed significant sequence similarity to those of Synechocystis, indicating a high degree of divergence of the gene information between the two cyanobacterial strains.

19.
Within the framework of an international project for the sequencing of the entire Bacillus subtilis genome, a 36-kb chromosome segment, which covers the region between the gnt and iol operons, has been cloned and sequenced. This region (36447 bp) contains 33 complete open reading frames (ORFs; genes) including the four gnt genes and one partial gene. A homology search for the products of the 33 complete ORFs revealed significant homology to known proteins in 16 of them such as tetracycline resistance protein (Clostridium perfringens), asparagine synthetase (Arabidopsis thaliana), aldehyde dehydrogenase (Pseudomonas oleovorans), 2,5-dichloro-2,5-cyclohexadiene-1,4-diol dehydrogenase (P. paucimobilis), heat shock protein HtpG (Escherichia coli), galactose-proton symporter (E. coli), auxin-induced protein (common tobacco), glucitol operon repressor (E. coli) and methylmalonate-semialdehyde dehydrogenase (P. aeruginosa). Unlike the regions we sequenced so far, this region contained two short sequence multiplications: one was a tandem sequence duplication (409 and 410 bp), and the other a triplication consisting of two highly conserved 118-bp tandem sequences preceded by a less conserved similar sequence (129 bp). The reasons for the presence of these sequence multiplications in the gnt to iol region were deduced.

20.