Similar literature
Found 20 similar articles (search time: 15 ms)
1.
A total of 17 P1 and TAC clones, each representing an assigned region of chromosome 5, were isolated from P1 and TAC genomic libraries of Arabidopsis thaliana Columbia, and their nucleotide sequences were determined. The clones sequenced in this study total 1,081,958 bp in length. As we have previously reported 9,072,622 bp of sequence from the analysis of 125 P1 and TAC clones, the total length of the chromosome 5 sequences determined so far is now 10,154,580 bp. The sequences were subjected to similarity searches against protein and EST databases and to analysis with computer programs for gene modeling. As a result, a total of 253 potential protein-coding genes with known or predicted functions were identified. The positions of exons that do not show apparent similarity to known genes were also assigned using computer programs for exon prediction. The average density of the genes identified in this study was 1 gene per 4277 bp. Introns were observed in 74% of the potential protein genes, and the average number of introns per gene and the average intron length were 4.3 and 168 bp, respectively. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
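The length and density figures quoted in this abstract can be rechecked with a few lines of arithmetic. The sketch below is only a sanity check on the reported numbers; the variable names are illustrative and do not come from the paper.

```python
# Figures as reported in the abstract above.
new_bp = 1_081_958        # length sequenced in this study
previous_bp = 9_072_622   # length reported previously
genes = 253               # potential protein-coding genes in the new regions

# Cumulative length of chromosome 5 sequence determined so far.
total_bp = new_bp + previous_bp

# Average spacing of the newly identified genes, in bp per gene.
bp_per_gene = new_bp / genes

print(total_bp)            # 10154580
print(round(bp_per_gene))  # 4277
```

Both results agree with the totals stated in the abstract (10,154,580 bp and 1 gene per 4277 bp).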

2.
The identification of genes involved in host-pathogen interactions is important for the elucidation of mechanisms of disease resistance and host susceptibility. A traditional way to classify the origin of genes sampled from a pool of mixed cDNA is through sequence similarity to known genes from either the pathogen or host organism or other closely related species. This approach does not work when the identified sequence has no close homologues in the sequence databases. In our previous studies, we classified genes using their codon frequencies. This method, however, explicitly required the prediction of CDS regions and thus could not be applied to sequences derived from the non-coding regions of genes. In this study, we show that the use of sliding-window triplet frequencies extends the application of the algorithm to both coding and non-coding sequences and also increases the prediction accuracy of a Support Vector Machine classifier from 95.6 ± 0.3 to 96.5 ± 0.2. The use of triplet frequencies thus reduced the classification error of the new method by more than 20% relative to our previous approach. We also describe a functional analysis of the sequences, which detected gene families with a significantly higher or lower probability of being correctly classified than the average accuracy of the method. A server that classifies EST sequences using triplet frequencies is available at http://mips.gsf.de/proj/est3.
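As an illustration of the feature extraction described above, the following minimal sketch counts triplets with a sliding window of width 3 and step 1, so no reading frame or CDS prediction is needed. It is an assumption about how such features might be computed, not the authors' implementation; the function name `triplet_frequencies` and the handling of ambiguous bases are hypothetical.

```python
from collections import Counter
from itertools import product


def triplet_frequencies(seq: str) -> dict[str, float]:
    """Relative frequency of each of the 64 nucleotide triplets,
    counted over every overlapping window of width 3."""
    seq = seq.upper()
    windows = [seq[i:i + 3] for i in range(len(seq) - 2)]
    # Assumption: windows containing ambiguous bases (e.g. N) are skipped.
    counts = Counter(w for w in windows if set(w) <= set("ACGT"))
    total = sum(counts.values())
    return {
        "".join(t): (counts["".join(t)] / total if total else 0.0)
        for t in product("ACGT", repeat=3)
    }


freqs = triplet_frequencies("ATGATGA")
print(freqs["ATG"], freqs["TGA"], freqs["GAT"])
```

The resulting 64-element vector could then be passed to any classifier; the abstract reports using a Support Vector Machine. If the quoted accuracies are read as percentages, they correspond to error rates of 4.4 and 3.5, a relative error reduction of about 20%.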

3.
Forty-eight resistance (R) genes conferring resistance to various types of pests have been cloned from 12 plant species. Irrespective of the host or the pest type, most R genes share a strong protein sequence similarity especially for domains and motifs. The objective of this study was to identify expressed R genes of wheat, the fraction of which is expected to be very low in the genome. Using modified RNA fingerprinting and data mining approaches we identified 220 expressed R-gene candidates. Of these, 125 sequences structurally resembled known R genes. In addition to 25-87% protein sequence similarity with the known R genes, the sequence, order, and distribution of the domains and motifs were also the same. Among the remaining 95, 17 were probable R-related, 21 were a new class of nucleotide-binding kinases, 21 were probable kinases, and 36 were p-loop-containing unknown sequences. About 76% were rare including 73 novel sequences. Three new R-gene specific motifs were also identified. Physical mapping of the 164 best R-gene candidates on 339 deletion lines localized 121 mappable R-gene candidates to 26 small chromosomal regions encompassing about 16% of the genome. About 90 of the 110 phenotypically characterized wheat R genes corresponding to 18 different pests also mapped in these regions.

4.
5.
A total of 56 TAC clones with an average insert size of 100 kb were isolated from a TAC library of the Lotus japonicus genome based on expressed sequence tag (EST), cDNA and gene information, and their nucleotide sequences were determined using a shotgun-based strategy. The total length of the sequenced regions is 5,473,195 bp. By comparison with the sequences in protein and EST databases and analysis with computer programs for gene modeling, a total of 605 potential protein-encoding genes with known or predicted functions, 69 gene segments, and 172 pseudogenes were identified. The average density of the genes assigned so far is 1 gene/8120 bp. Introns were identified in approximately 78% of the potential genes. There was an average of 3.8 introns per gene and the average length of the introns was 375 bp. DNA markers were generated based on the nucleotide sequences obtained, and each clone was mapped onto the linkage map using the F2 mapping population derived from a cross of L. japonicus Gifu B-129 and Miyakojima MG-20. The sequence data, gene information and mapping information are available through the World Wide Web at http://www.kazusa.or.jp/lotus/.

6.
Sixty-five TAC (transformation-competent artificial chromosome) clones were selected from a genomic library of Lotus japonicus accession MG-20 based on expressed sequence tag (EST), cDNA and gene information, and their nucleotide sequences were determined. The average insert size of the TAC clones was approximately 100 kb, and the total length of the sequenced regions in this study is 6,556,100 bp. Together with the nucleotide sequences of 56 TAC clones previously reported, the regions sequenced so far total 12,029,295 bp. By comparison with the sequences in protein and EST databases and by analysis with computer programs for gene modeling, a total of 711 potential protein-encoding genes with known or predicted functions, 239 gene segments and 90 pseudogenes were identified in the newly sequenced regions. The average gene density assigned so far was 1 gene/9140 bp. The average length of the assigned genes was 2.6 kb, which is considerably larger than that assigned in the Arabidopsis thaliana genome (1.9 kb for 6451 genes). Introns were identified in approximately 73% of the potential genes, and the average number and length of the introns per gene were 3.4 and 377 bp, respectively. Simple sequence repeat length polymorphism (SSLP) or derived cleaved amplified polymorphic sequence (dCAPS) markers were generated based on the nucleotide sequences of the genomic clones obtained, and each clone was mapped onto the linkage map using the F2 mapping population derived from a cross of two accessions of L. japonicus, Gifu B-129 and Miyakojima MG-20. The sequence data, gene information and mapping information are available through the World Wide Web at http://www.kazusa.or.jp/lotus/.

7.
Colinearity in gene content and order between rice and closely related cereal crops has been a powerful tool for gene identification. Using a comparative genomic approach, we have identified the rice genomic region syntenous to the region of the short arm of wheat chromosome 2D, on which quantitative trait loci (QTLs) for Fusarium head blight (FHB) resistance and for controlling accumulation of the mycotoxin deoxynivalenol (DON) are closely located. Utilizing markers known to reside near the FHB resistance QTL and data from several wheat genetic maps, we have limited the syntenous region to 6.8 Mb of the short arm of rice chromosome 4. From the 6.8-Mb sequence of rice chromosome 4, we found three putative rice genes that could have a role in detoxification of mycotoxins. DNA sequences of these putative rice genes were used in BLAST searches to identify wheat expressed sequence tags (ESTs) exhibiting significant similarity. Combined data from expression analysis and gene mapping of wheat homologues and results of analysis of DON accumulation using doubled haploid populations revealed that a putative gene for multidrug resistance-associated protein (MRP) is a possible candidate for the FHB resistance and/or DON accumulation controlling QTLs on wheat chromosome 2DS and can be used as a molecular marker to eliminate the susceptible allele when the Chinese wheat variety Sumai 3 is used as a resistance source.

8.
9.
10.
11.
Yu J  Woloshuk CP  Bhatnagar D  Cleveland TE 《Gene》2000,246(1-2):157-167
A collection of lethal and semi-lethal P-element insertions in the 70CD region of chromosome 3 of Drosophila melanogaster was used to investigate genes and gene arrangements by a combination of genetic, cytological, functional and molecular methods. The 12 lethal insertions studied fall into seven complementation groups of six genes. Lethal phases, expression patterns and other phenotypic aspects of these genes were determined. The genes and additional available sequences were placed on cloned genomic DNA fragments and arranged in an EcoRI map of 150 kb that covers approximately the bands 70C7-8 to 70D1. Determination of deficiency breakpoints links the genetic, physical and molecular data. The sequences adjacent to seven independent P-element insertions were established after plasmid rescue or polymerase chain reaction. Similarity searches allowed the assignment of the P-element insertions to known mutations, expressed sequence tags, sequence tagged sites, or homologous genes of other species. Among these were identified a putative transacylase, a putative cell cycle gene, and the gene responsible for the dominant Polycomb-suppressor phenotype of devenir. The genomic sequence of the l(3)70Ca/b gene reveals a novel heat shock protein (hsc70Cb). l(3)70Da was identified as a member of the CDC48/PEX1 ATPase family and its coding sequence was determined.

12.
A total of sixty-two clones were selected from a TAC (transformation-competent artificial chromosome) genomic library of the Lotus japonicus accession MG-20 based on the sequence information of expressed sequence tags (ESTs), cDNA and gene information, and their nucleotide sequences were determined. The length of the sequenced regions in this study is 6,682,189 bp, and the total length of the regions sequenced so far is 18,711,484 bp together with the nucleotide sequences of 121 TAC clones previously reported. By comparison with the sequences in protein and EST databases and analysis with computer programs for gene modeling, a total of 573 potential protein-coding genes with known or predicted functions, 91 gene segments and 272 pseudogenes were identified in the newly sequenced regions. Each of the sequenced clones was localized onto the linkage map of two accessions of L. japonicus, Gifu B-129 and Miyakojima MG-20, using simple sequence repeat length polymorphism (SSLP) or derived cleaved amplified polymorphic sequence (dCAPS) markers generated based on the nucleotide sequences of the clones. The sequence data, gene information and mapping information are available through the World Wide Web at http://www.kazusa.or.jp/lotus/.

13.
A panel of 17 tetraploid and 11 diploid potato genotypes was screened by comparative sequence analysis of polymerase chain reaction (PCR) products for single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (InDels), in regions of the potato genome where genes for qualitative and/or quantitative resistance to different pathogens have been localized. Most SNP and InDel markers were derived from bacterial artificial chromosome (BAC) insertions that contain sequences similar to the family of plant genes for pathogen resistance having nucleotide-binding-site and leucine-rich-repeat domains (NBS-LRR-type genes). Forty-four such NBS-LRR-type genes containing BAC-insertions were mapped to 14 loci, which tag most known resistance quantitative trait loci (QTL) in potato. Resistance QTL not linked to known resistance-gene-like (RGL) sequences were tagged with other markers. In total, 78 genomic DNA fragments with an overall length of 31 kb were comparatively sequenced in the panel of 28 genotypes. 1498 SNPs and 127 InDels were identified, which corresponded, on average, to one SNP every 21 base pairs and one InDel every 243 base pairs. The nucleotide diversity of the tetraploid genotypes (π = 0.72 × 10⁻³) was lower when compared with diploid genotypes (π = 2.31 × 10⁻³). RGL sequences showed higher nucleotide diversity when compared with other sequences, suggesting evolution by divergent selection. Information on sequences, sequence similarities, SNPs and InDels is provided in a database that can be queried via the Internet.
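For readers unfamiliar with the nucleotide diversity statistic quoted above, the sketch below computes it in its textbook form: the mean number of differences per site over all pairs of aligned sequences. The abstract does not say how the authors estimated it (for example, how gaps or tetraploid haplotypes were treated), so this is an illustrative assumption, and `nucleotide_diversity` is a hypothetical helper name.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Mean pairwise differences per site over equal-length aligned sequences."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(aligned, 2))
    # Count mismatching columns for every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * length)


# Toy alignment: three sequences of 10 sites, 4 pairwise differences in total.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
print(nucleotide_diversity(toy))  # 4 differences / (3 pairs * 10 sites)
```

The SNP density in the abstract follows from simpler arithmetic: roughly 31,000 bp divided by 1498 SNPs gives about one SNP per 21 bp.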

14.
15.
The majority of verified plant disease resistance genes isolated to date are of the NBS-LRR class, encoding proteins with a predicted nucleotide binding site (NBS) and a leucine-rich repeat (LRR) region. We took advantage of the sequence conservation in the NBS motif to clone, by PCR, gene fragments from barley representing putative disease resistance genes of this class. Over 30 different resistance gene analogs (RGAs) were isolated from the barley cultivar Regatta. These were grouped into 13 classes based on DNA sequence similarity. Actively transcribed genes were identified from all classes but one, and cDNA clones were isolated to derive the complete NBS-LRR protein sequences. Some of the NBS-LRR genes exhibited variation with respect to whether and where particular introns were spliced, as well as frequent premature polyadenylation. DNA sequences related to the majority of the barley RGAs were identified in the recently expanded public rice genomic sequence database, indicating that the rice sequence can be used to extract a large proportion of the RGAs from barley and other cereals. Using a combination of RFLP and PCR marker techniques, representatives of all barley RGA gene classes were mapped in the barley genome, to all chromosomes except 4H. A number of the RGA loci map in the vicinity of known disease resistance loci, and the association between RGA S-120 and the nematode resistance locus Ha2 on chromosome 2H was further tested by co-segregation analysis. Most of the RGA sequences reported here have not been described previously, and represent a useful resource as candidates or molecular markers for disease resistance genes in barley and other cereals.

16.
Fuchs B  Zhang K  Bolander ME  Sarkar G 《Gene》2000,258(1-2):155-163
The need for rapid identification of differentially expressed genes will persist even after the complete human genomic sequence becomes available. The most popular method for identifying differentially expressed genes acquires expressed sequence tags (ESTs) from the extreme 3' non-coding end of mRNAs. Such ESTs have limitations for downstream applications. We have developed a method, termed preferential amplification of coding sequences (PACS), that was applied to identify differentially expressed coding sequence tags (dCSTs) between osteoblasts and osteosarcoma cells. PACS was achieved by PCR with a set of primers to anchor at sequences complementary to AUG sequences in mRNAs and another set of primers to anchor at a PCR-amplifiable distance from AUG sequences. An initial screen identified 103 candidate dCSTs after screening approximately 15% of the expressed genes between the two cell types. Of these sequences, 27 represent CSTs of known genes and two are from 3'-ESTs of known mRNAs. Thus, PACS identified CSTs approximately 13.5 times more often than it identified 3' ESTs, attesting to the objective of the method. Since many of the dCSTs represent known genes, their identity and potential relevance to osteosarcoma could be immediately hypothesized. Differential expression of many of the dCSTs was further demonstrated by northern blotting or RT-PCR. Since PACS is not dependent on the existence of a poly A tail on an mRNA, it should have application to identify dCSTs for both prokaryotic and eukaryotic organisms. Additionally, PACS should aid in the identification of cell-specific or tissue-specific genes and bidirectional acquisition of cDNA sequence enabling rapid retrieval of full-length cDNA sequence of novel genes.

17.
We have designed a high-throughput system for the identification of novel crystal protein genes (cry) from Bacillus thuringiensis strains. The system was developed with two goals: (i) to acquire the mixed plasmid-enriched genomic sequence of B. thuringiensis using next-generation sequencing biotechnology, and (ii) to identify cry genes with a computational pipeline (using BtToxin_scanner). In our pipeline method, we employed three different kinds of well-developed prediction methods, BLAST, hidden Markov model (HMM), and support vector machine (SVM), to predict the presence of Cry toxin genes. The pipeline proved to be fast (average speed, 1.02 Mb/min for proteins and open reading frames [ORFs] and 1.80 Mb/min for nucleotide sequences), sensitive (it detected 40% more protein toxin genes than a keyword extraction method using genomic sequences downloaded from GenBank), and highly specific. Twenty-one strains from our laboratory's collection were selected based on their plasmid pattern and/or crystal morphology. The plasmid-enriched genomic DNA was extracted from these strains and mixed for Illumina sequencing. The sequencing data were de novo assembled, and a total of 113 candidate cry sequences were identified using the computational pipeline. Twenty-seven candidate sequences were selected on the basis of their low level of sequence identity to known cry genes, and eight full-length genes were obtained with PCR. Finally, three new cry-type genes (primary ranks) and five cry holotypes, which were designated cry8Ac1, cry7Ha1, cry21Ca1, cry32Fa1, and cry21Da1 by the B. thuringiensis Toxin Nomenclature Committee, were identified. The system described here is both efficient and cost-effective and can greatly accelerate the discovery of novel cry genes.

18.
19.
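Isolation of brown planthopper genes regulated by rice resistance using suppression subtractive hybridization   Total citations: 2, self-citations: 0, citations by others: 2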
杨之帆  陈永勤  李春华  蒋思婧 《昆虫学报》2009,52(10):1059-1067
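To isolate genes of the brown planthopper Nilaparvata lugens that are regulated by rice resistance, fourth-instar nymphs that had fed for 24 h on seedlings (two-leaf, one-heart stage) of either the susceptible rice variety Taichung Native 1 or the highly resistant rice B5 were used as starting material, and forward and reverse subtracted cDNA libraries between the two populations were constructed by suppression subtractive hybridization. cDNA clones representing genes regulated by rice resistance were screened from the subtracted libraries by dot-blot hybridization, then sequenced and functionally analysed, and genes with assigned functions were selected for verification by Northern hybridization. The 98 positive clones identified by dot-blot screening represented 92 non-redundant unigenes, 25 of which showed relatively high homology to known animal protein genes. Northern hybridization showed that, of these 25 genes, 11 were up-regulated and 8 were down-regulated, suggesting that they may play important roles in the adaptation of the brown planthopper to resistant rice. These results lay a foundation for cloning the full-length cDNA sequences of these novel genes and for further study of their functions in the interaction between the brown planthopper and rice.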

20.
Guo H  Moose SP 《The Plant cell》2003,15(5):1143-1158
Surveys for conserved noncoding sequences (CNS) among genes from monocot cereal species were conducted to assess the general properties of CNS in grass genomes and their correlation with known promoter regulatory elements. Initial comparisons of 11 orthologous maize-rice gene pairs found that previously defined regulatory motifs could be identified within short CNS but could not be distinguished reliably from random sequence matches. Among the different phylogenetic footprinting algorithms tested, the VISTA tool yielded the most informative alignments of noncoding sequence. VISTA was used to survey for CNS among all publicly available genomic sequences from maize, rice, wheat, barley, and sorghum, representing >300 gene comparisons. Comparisons of orthologous maize-rice and maize-sorghum gene pairs identified 20 bp as a minimal length criterion for a significant CNS among grass genes, with few such CNS found to be conserved across rice, maize, sorghum, and barley. The frequency and length of cereal CNS as well as nucleotide substitution rates within CNS were consistent with the known phylogenetic distances among the species compared. The implications of these findings for the evolution of cereal gene promoter sequences and the utility of using the nearly completed rice genome sequence to predict candidate regulatory elements in other cereal genes by phylogenetic footprinting are discussed.
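The 20 bp minimal length criterion mentioned above can be illustrated with a small sketch that scans a pairwise alignment for ungapped runs of identical bases. This is a deliberately simplified stand-in and not the VISTA algorithm: the exact-identity requirement and the function name `conserved_runs` are assumptions made for illustration only.

```python
def conserved_runs(aln_a: str, aln_b: str, min_len: int = 20) -> list[tuple[int, int]]:
    """Return (start, end) alignment columns of ungapped, identical runs
    that are at least min_len columns long; end is exclusive."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    runs: list[tuple[int, int]] = []
    start = None
    for i, (a, b) in enumerate(zip(aln_a.upper(), aln_b.upper())):
        match = a == b and a != "-"   # gap columns never count as conserved
        if match and start is None:
            start = i
        elif not match and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(aln_a) - start >= min_len:
        runs.append((start, len(aln_a)))
    return runs


# Toy alignment: a 25-column identical run, one mismatch, then a 10-column run.
a = "A" * 25 + "C" + "G" * 10
b = "A" * 25 + "T" + "G" * 10
print(conserved_runs(a, b))  # only the 25-column run passes the 20 bp cutoff
```

Lowering `min_len` to 10 would also report the second run, which shows how the length threshold trades sensitivity against chance matches.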
