Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Soybean genomic survey: BAC-end sequences near RFLP and SSR markers.   Total citations: 8, self-citations: 0, citations by others: 8
We are building a framework physical infrastructure across the soybean genome by using SSR (simple sequence repeat) and RFLP (restriction fragment length polymorphism) markers to identify BACs (bacterial artificial chromosomes) from two soybean BAC libraries. The libraries were prepared from two genotypes, each digested with a different restriction enzyme. The BACs identified by each marker were grouped into contigs. We have obtained BAC-end sequences from BACs within each contig. The sequences were analyzed by the University of Minnesota Center for Computational Genomics and Bioinformatics using BLAST algorithms to search nucleotide and protein databases. The SSR-identified BACs had a higher percentage of significant BLAST hits than did the RFLP-identified BACs. This difference was due to a higher percentage of hits to repetitive-type sequences for the SSR-identified BACs, which was offset in part by a somewhat larger proportion of RFLP-identified significant hits with similarity to experimentally defined genes and soybean ESTs (expressed sequence tags). These genes represented a wide range of metabolic functions. In these analyses, only repetitive sequences from SSR-identified contigs appeared to be clustered. The BAC-end sequences also allowed us to identify microsynteny between soybean and the model plants Arabidopsis thaliana and Medicago truncatula. This map-based approach to genome sampling provides a means of assaying soybean genome structure and organization.

2.
The Fabaceae, the third largest family of plants and the source of many crops, has been the target of many genomic studies. Currently, only the grasses surpass the legumes in the number of publicly available expressed sequence tags (ESTs). The quantity of sequences from diverse plants enables the use of computational approaches to identify novel genes in specific taxa. We used BLAST algorithms to compare unigene sets from Medicago truncatula, Lotus japonicus, and soybean (Glycine max and Glycine soja) to nonlegume unigene sets, to GenBank's nonredundant and EST databases, and to the genomic sequences of rice (Oryza sativa) and Arabidopsis. As a working definition, putatively legume-specific genes had no sequence homology, below a specified threshold, to publicly available sequences of nonlegumes. Using this approach, 2,525 legume-specific EST contigs were identified, of which fewer than three percent had clear homology to previously characterized legume genes. As a first step toward predicting function, related sequences were clustered to build motifs that could be searched against protein databases. Three families of interest were more deeply characterized: F-box related proteins, Pro-rich proteins, and Cys cluster proteins (CCPs). Of particular interest were the >300 CCPs, primarily from nodules or seeds, with predicted similarity to defensins. Motif searching also identified several previously unknown CCP-like open reading frames in Arabidopsis. Evolutionary analyses of the genomic sequences of several CCPs in M. truncatula suggest that this family has evolved by local duplications and divergent selection.

3.
4.
5.
Non-redundant expressed sequence tags (ESTs) were generated from six different organs at various developmental stages of Chinese cabbage, Brassica rapa L. ssp. pekinensis. Of the 1,295 ESTs, 915 (71%) showed significant homology in nucleotide or deduced amino acid sequence with other sequences deposited in databases, while 380 did not show similarity to any sequence. Briefly, 598 ESTs matched proteins of identified biological function, 177 matched hypothetical proteins or non-annotated Arabidopsis genome sequences, and 140 matched other ESTs. About 82% of the top-scoring matching sequences were from Arabidopsis or Brassica, but overall only 558 (43%) ESTs matched Arabidopsis ESTs at the nucleotide sequence level. This observation strongly supports the idea that the gene-expression profiles of Chinese cabbage differ from those of Arabidopsis, despite the similarity of their genome structures. Moreover, sequence analyses of 21 Brassica ESTs revealed that their primary structures differ from those of the corresponding annotated sequences of Arabidopsis genes. Our data suggest that direct prediction of Brassica gene expression patterns based on information from Arabidopsis genome research has some limitations. Thus, information obtained from the Brassica EST study is useful not only for understanding the unique developmental processes of the plant, but also for the study of Arabidopsis genome structure.

6.
7.
Although GenBank now holds over 1,400,000 expressed sequence tags (ESTs) from soybean, most publicly available ESTs have been derived from tissues or environmental conditions other than developing seeds. Annotating the molecular mechanisms of soybean seed development requires a complete analysis of the gene expression profiles of immature seeds at various stages. Here we constructed a full-length-enriched cDNA library comprising 45,408 cDNA clones that cover various stages of soybean seed development. We sequenced these clones from their 5′ ends and obtained 36,656 ESTs. These EST sequences could be grouped into 27,982 unigenes, including 22,867 contigs and 5,115 singletons, of which 27,931 could be mapped onto the 20 soybean chromosome sequences. Comparative genomic analysis with other plants revealed that these unigenes include many candidate genes specific to dicots, legumes and soybean. Approximately 1,789 of these unigenes currently show no homology to known soybean sequences, suggesting that many represent mRNAs specifically expressed in seeds. Novel abundant genes involved in oil synthesis were also found in this study and may serve as a valuable resource for soybean seed improvement.

8.
9.
Using a strategy requiring only modest computational resources, wheat expressed sequence tag (EST) sequences from various sources were assembled into contigs and compared with a nonredundant barley sequence assembly, with ESTs, with complete draft genome sequences of rice and Arabidopsis thaliana, and with ESTs from other plant species. These comparisons indicate that (i) wheat sequences available from public sources represent a substantial proportion of the diversity of wheat coding sequences, (ii) prediction of open reading frames in the whole genome sequence improves when supplemented with EST information from other species, (iii) a substantial number of candidates for novel genes that are unique to wheat or related species can be identified, and (iv) a smaller number of genes can be identified that are common to monocots and dicots but absent from Arabidopsis. The sequences in the last group may have been lost from Arabidopsis after descent from a common ancestor. Examples of potential novel wheat genes and Triticeae-specific genes are presented.

10.
Soybean rust is caused by the obligate fungal pathogen Phakopsora pachyrhizi Sydow. A unidirectional cDNA library was constructed using mRNA isolated from germinating P. pachyrhizi urediniospores to identify genes expressed at this physiological stage. Single-pass sequence analysis of 908 clones revealed 488 unique expressed sequence tags (ESTs, unigenes), of which 107 appeared as multiple copies. BLASTX analysis identified 189 unigenes with significant similarities (E-value < 10^-5) to sequences deposited in the NCBI non-redundant protein database. A search against the NCBI dbEST using the BLASTN algorithm revealed 32 ESTs with high or moderate similarities to plant and fungal sequences. Using the Expressed Gene Anatomy Classification, 31.7% of these ESTs were involved in primary metabolism, 14.3% in gene/protein expression, 7.4% in cell structure and growth, 6.9% in cell division, 4.8% in cell signaling/cell communication, and 4.8% in cell/organism defense. Approximately 29.6% of the identities were to hypothetical proteins and proteins with unknown function.
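The E-value cutoff quoted in the abstract above can be illustrated with a short filter. This is only a sketch: it assumes the hits were exported in BLAST's 12-column tabular layout (query ID in column 1, E-value in column 11), which the abstract does not state, and the function name `significant_queries` and the three example rows are invented for illustration.

```python
from typing import Iterable, Set

# Cutoff taken from the abstract (E-value < 10^-5).
E_VALUE_CUTOFF = 1e-5


def significant_queries(blast_lines: Iterable[str],
                        cutoff: float = E_VALUE_CUTOFF) -> Set[str]:
    """Return the query IDs that have at least one hit with E-value < cutoff.

    Assumes tab-separated rows in the 12-column tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits: Set[str] = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        query_id, e_value = fields[0], float(fields[10])
        if e_value < cutoff:
            hits.add(query_id)
    return hits


# Invented example rows (not data from the study).
example = [
    "est001\tprotA\t62.5\t120\t45\t0\t1\t360\t10\t129\t3e-40\t155",
    "est002\tprotB\t31.0\t58\t40\t0\t5\t178\t2\t59\t0.12\t32.0",
    "est003\tprotC\t48.2\t85\t44\t0\t2\t256\t7\t91\t1e-5\t60.1",
]
print(sorted(significant_queries(example)))
```

The comparison is strict, so a hit sitting exactly at the cutoff (est003 here) is not counted; whether the authors treated the boundary this way is not stated.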

11.
12.
Mining functional microsatellites in legume unigenes   Total citations: 1, self-citations: 0, citations by others: 1
Highly polymorphic and transferable microsatellites (SSRs) are important for comparative genomics, genome analysis and phylogenetic studies. Development of novel species-specific microsatellite markers remains a costly and labor-intensive project. Therefore, interest has shifted from genomic to genic markers owing to their high inter-species transferability, as they are developed from conserved coding regions of the genome. This study concentrates on comparative analysis of genic microsatellites in nine important legume species (Arachis hypogaea, Cajanus cajan, Cicer arietinum, Glycine max, Lotus japonicus, Medicago truncatula, Phaseolus vulgaris, Pisum sativum and Vigna unguiculata) and two model plant species (Oryza sativa and Arabidopsis thaliana). Screening of a total of 228,090 putative unique sequences spanning 219,610,522 bp using a microsatellite search tool, MISA, showed that 12.18% of the unigenes contained microsatellites, with 36,248 microsatellite motifs identified excluding mononucleotide repeats. The frequency of legume unigene-derived SSRs was one SSR in every 6.0 kb of analyzed sequence. Trinucleotide repeats were predominant in all the unigenes with the exception of C. cajan, which showed a prevalence of dinucleotide repeats over trinucleotide repeats. Dinucleotide and trinucleotide repeats together accounted for more than 90% of the total microsatellites. Among dinucleotide and trinucleotide repeats, AG and AAG motifs, respectively, were the most frequent. Microsatellite-positive chickpea unigenes were assigned Gene Ontology (GO) terms to identify the possible role of unigenes in various molecular and biological functions. These unigene-based microsatellite markers will prove valuable for recording allelic variance across germplasm collections, gene tagging and searching for putative candidate genes.
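To make the idea of a microsatellite search concrete, the following is a minimal regex-based sketch of perfect-repeat detection. It is not MISA and does not reproduce the study's settings: the minimum repeat counts in `MIN_REPEATS`, the function names, and the test sequence are all assumptions chosen for illustration. Mononucleotide runs are skipped, as in the abstract.

```python
import re
from typing import List, Tuple

# Assumed minimum repeat counts per motif length (illustrative only;
# the thresholds used in the study are not given in the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_degenerate(motif: str) -> bool:
    """True if the motif is itself a repeat of a shorter unit (e.g. AA, AGAG)."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq: str) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for each perfect SSR found in seq."""
    seq = seq.upper()
    found = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_degenerate(motif):
                continue  # mononucleotide runs and shorter motifs in disguise
            found.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(found)


def kb_per_ssr(total_bp: int, n_ssrs: int) -> float:
    """Average sequence length (kb) per SSR."""
    return total_bp / n_ssrs / 1000.0


# Invented test sequence: an (AG)8 and an (AAG)5 repeat with short flanks.
demo = "TTGC" + "AG" * 8 + "CCGT" + "AAG" * 5 + "TTC"
print(find_ssrs(demo))
print(kb_per_ssr(219_610_522, 36_248))
```

Applied to the pooled totals quoted in the abstract (all eleven species), `kb_per_ssr` gives roughly 6.06 kb per SSR. The abstract's figure of one SSR per 6.0 kb refers to the legume unigenes only, so the two numbers are not expected to match exactly.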

13.
14.
15.
16.
Extended comparison of gene sequences found on homeologous soybean bacterial artificial chromosomes (BACs) to Medicago truncatula and Arabidopsis thaliana genomic sequences demonstrated a network of synteny within conserved regions interrupted by gene additions and/or deletions. Consolidation of gene order among all three species provides a picture of ancestral gene order. The observation supports a genome history of fractionation resulting from gene loss/addition and rearrangement. In all three species, N-hydroxycinnamoyl/benzoyltransferase genes were identified in tandemly duplicated clusters. Parsimony-based gene trees suggest that the genes within the arrays have independently undergone tandem duplication in each species.

17.
18.
Chaetomium cupreum has potential as a biocontrol agent against a range of plant pathogens on the basis of its production of antifungal metabolites, mycoparasitism, competition for space and nutrients, or various combinations of these. To explore genes expressed in C. cupreum, a cDNA library was constructed from mycelium and 3,066 expressed sequence tags (ESTs) were generated. Cluster analysis identified 1,471 unigenes, comprising 392 contigs and 1,079 singleton sequences. Putative functions were assigned to 874 unigenes that exhibited strong similarity to genes/ESTs in public databases, putatively including genes involved in cellular component, molecular function, and biological process. The other 597 ESTs, representing novel genes, showed no significant similarity to sequences in the NCBI public databases. A proportion of the identified genes were related to degradation of pathogen cell walls and antifungal metabolite production, as expected for a biocontrol fungus. This work is a first step towards knowledge of the C. cupreum genome. The results demonstrate the usefulness of EST analysis in C. cupreum and provide a preliminary indication of gene expression putatively involved in biocontrol.

19.
Expressed Sequence Tag (EST) analysis has pioneered genome-wide gene discovery and expression profiling. In order to establish a gene expression index in the rice cultivar indica, we sequenced and analyzed 86,136 ESTs from nine rice cDNA libraries from the super hybrid cultivar LYP9 and its parental cultivars. We assembled these ESTs into 13,232 contigs, leaving 8,976 singletons. Overall, 7,497 sequences were found to be similar to existing sequences in GenBank and 14,711 were novel. These sequences were classified by molecular function, biological process and pathway according to the Gene Ontology. We compared our sequenced ESTs with the publicly available 95,000 ESTs from japonica, and found little sequence variation, despite the large difference between the genome sequences. We then assembled the combined 173,000 rice ESTs for further analysis. Using the pooled ESTs, we compared gene expression in metabolic pathways between rice and Arabidopsis according to KEGG. We further profiled gene expression pattern

20.

Copyright©北京勤云科技发展有限公司  京ICP备09084417号