首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 484 毫秒
1.
2.
3.
4.
Development and annotation of perennial Triticeae ESTs and SSR markers   总被引:2,自引:0,他引:2  
Triticeae contains hundreds of species of both annual and perennial types. Although substantial genomic tools are available for annual Triticeae cereals such as wheat and barley, the perennial Triticeae lack sufficient genomic resources for genetic mapping or diversity research. To increase the amount of sequence information available in the perennial Triticeae, three expressed sequence tag (EST) libraries were developed and annotated for Pseudoroegneria spicata, a mixture of both Elymus wawawaiensis and E. lanceolatus, and a Leymus cinereus x L. triticoides interspecific hybrid. The ESTs were combined into unigene sets of 8 780 unigenes for P. spicata, 11 281 unigenes for Leymus, and 7 212 unigenes for Elymus. Unigenes were annotated based on putative orthology to genes from rice, wheat, barley, other Poaceae, Arabidopsis, and the non-redundant database of the NCBI. Simple sequence repeat (SSR) markers were developed, tested for amplification and polymorphism, and aligned to the rice genome. Leymus EST markers homologous to rice chromosome 2 genes were syntenous on Leymus homeologous groups 6a and 6b (previously 1b), demonstrating promise for in silico comparative mapping. All ESTs and SSR markers are available on an EST information management and annotation database (http://titan.biotec.uiuc.edu/triticeae/).  相似文献   

5.
6.
Using a strategy requiring only modest computational resources, wheat expressed sequence tag (EST) sequences from various sources were assembled into contigs and compared with a nonredundant barley sequence assembly, with ESTs, with complete draft genome sequences of rice and Arabidopsis thaliana, and with ESTs from other plant species. These comparisons indicate that (i) wheat sequences available from public sources represent a substantial proportion of the diversity of wheat coding sequences, (ii) prediction of open reading frames in the whole genome sequence improves when supplemented with EST information from other species, (iii) a substantial number of candidates for novel genes that are unique to wheat or related species can be identified, and (iv) a smaller number of genes can be identified that are common to monocots and dicots but absent from Arabidopsis. The sequences in the last group may have been lost from Arabidopsis after descendance from a common ancestor. Examples of potential novel wheat genes and Triticeae-specific genes are presented.  相似文献   

7.
The genome of Arabidopsis has been searched for sequences of genes involved in acyl lipid metabolism. Over 600 encoded proteins have been identified, cataloged, and classified according to predicted function, subcellular location, and alternative splicing. At least one-third of these proteins were previously annotated as "unknown function" or with functions unrelated to acyl lipid metabolism; therefore, this study has improved the annotation of over 200 genes. In particular, annotation of the lipolytic enzyme group (at least 110 members total) has been improved by the critical examination of the biochemical literature and the sequences of the numerous proteins annotated as "lipases." In addition, expressed sequence tag (EST) data have been surveyed, and more than 3,700 ESTs associated with the genes were cataloged. Statistical analysis of the number of ESTs associated with specific cDNA libraries has allowed calculation of probabilities of differential expression between different organs. More than 130 genes have been identified with a statistical probability > 0.95 of preferential expression in seed, leaf, root, or flower. All the data are available as a Web-based database, the Arabidopsis Lipid Gene database (http://www.plantbiology.msu.edu/lipids/genesurvey/index.htm). The combination of the data of the Lipid Gene Catalog and the EST analysis can be used to gain insights into differential expression of gene family members and sets of pathway-specific genes, which in turn will guide studies to understand specific functions of individual genes.  相似文献   

8.
EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.  相似文献   

9.
10.
MOTIVATION: Accurate gene structure annotation is a challenging computational problem in genomics. The best results are achieved with spliced alignment of full-length cDNAs or multiple expressed sequence tags (ESTs) with sufficient overlap to cover the entire gene. For most species, cDNA and EST collections are far from comprehensive. We sought to overcome this bottleneck by exploring the possibility of using combined EST resources from fairly diverged species that still share a common gene space. Previous spliced alignment tools were found inadequate for this task because they rely on very high sequence similarity between the ESTs and the genomic DNA. RESULTS: We have developed a computer program, GeneSeqer, which is capable of aligning thousands of ESTs with a long genomic sequence in a reasonable amount of time. The algorithm is uniquely designed to tolerate a high percentage of mismatches and insertions or deletions in the EST relative to the genomic template. This feature allows use of non-cognate ESTs for gene structure prediction, including ESTs derived from duplicated genes and homologous genes from related species. The increased gene prediction sensitivity results in part from novel splice site prediction models that are also available as a stand-alone splice site prediction tool. We assessed GeneSeqer performance relative to a standard Arabidopsis thaliana gene set and demonstrate its utility for plant genome annotation. In particular, we propose that this method provides a timely tool for the annotation of the rice genome, using abundant ESTs from other cereals and plants. AVAILABILITY: The source code is available for download at http://bioinformatics.iastate.edu/bioinformatics2go/gs/download.html. Web servers for Arabidopsis and other plant species are accessible at http://www.plantgdb.org/cgi-bin/AtGeneSeqer.cgi and http://www.plantgdb.org/cgi-bin/GeneSeqer.cgi, respectively. For non-plant species, use http://bioinformatics.iastate.edu/cgi-bin/gs.cgi. The splice site prediction tool (SplicePredictor) is distributed with the GeneSeqer code. A SplicePredictor web server is available at http://bioinformatics.iastate.edu/cgi-bin/sp.cgi  相似文献   

11.

Background

Infection of plants by pathogens and the subsequent disease development involves substantial changes in the biochemistry and physiology of both partners. Analysis of genes that are expressed during these interactions represents a powerful strategy to obtain insights into the molecular events underlying these changes. We have employed expressed sequence tag (EST) analysis to identify rice genes involved in defense responses against infection by the blast fungus Magnaporthe oryzae and fungal genes involved in infectious growth within the host during a compatible interaction.

Results

A cDNA library was constructed with RNA from rice leaves (Oryza sativa cv. Hwacheong) infected with M. oryzae strain KJ201. To enrich for fungal genes, subtraction library using PCR-based suppression subtractive hybridization was constructed with RNA from infected rice leaves as a tester and that from uninfected rice leaves as the driver. A total of 4,148 clones from two libraries were sequenced to generate 2,302 non-redundant ESTs. Of these, 712 and 1,562 ESTs could be identified to encode fungal and rice genes, respectively. To predict gene function, Gene Ontology (GO) analysis was applied, with 31% and 32% of rice and fungal ESTs being assigned to GO terms, respectively. One hundred uniESTs were found to be specific to fungal infection EST. More than 80 full-length fungal cDNA sequences were used to validate ab initio annotated gene model of M. oryzae genome sequence.

Conclusion

This study shows the power of ESTs to refine genome annotation and functional characterization. Results of this work have advanced our understanding of the molecular mechanisms underpinning fungal-plant interactions and formed the basis for new hypothesis.  相似文献   

12.
Expressed sequence tags (ESTs) currently encompass more entries in the public databases than any other form of sequence data. Thus, EST data sets provide a vast resource for gene identification and expression profiling. We have mapped the complete set of 176,915 publicly available Arabidopsis EST sequences onto the Arabidopsis genome using GeneSeqer, a spliced alignment program incorporating sequence similarity and splice site scoring. About 96% of the available ESTs could be properly aligned with a genomic locus, with the remaining ESTs deriving from organelle genomes and non-Arabidopsis sources or displaying insufficient sequence quality for alignment. The mapping provides verified sets of EST clusters for evaluation of EST clustering programs. Analysis of the spliced alignments suggests corrections to current gene structure annotation and provides examples of alternative and non-canonical pre-mRNA splicing. All results of this study were parsed into a database and are accessible via a flexible Web interface at http://www.plantgdb.org/AtGDB/.  相似文献   

13.
14.
We have developed a rice (Oryza sativa) genome annotation database (Osa1) that provides structural and functional annotation for this emerging model species. Using the sequence of O. sativa subsp. japonica cv Nipponbare from the International Rice Genome Sequencing Project, pseudomolecules, or virtual contigs, of the 12 rice chromosomes were constructed. Our most recent release, version 3, represents our third build of the pseudomolecules and is composed of 98% finished sequence. Genes were identified using a series of computational methods developed for Arabidopsis (Arabidopsis thaliana) that were modified for use with the rice genome. In release 3 of our annotation, we identified 57,915 genes, of which 14,196 are related to transposable elements. Of these 43,719 non-transposable element-related genes, 18,545 (42.4%) were annotated with a putative function, 5,777 (13.2%) were annotated as encoding an expressed protein with no known function, and the remaining 19,397 (44.4%) were annotated as encoding a hypothetical protein. Multiple splice forms (5,873) were detected for 2,538 genes, resulting in a total of 61,250 gene models in the rice genome. We incorporated experimental evidence into 18,252 gene models to improve the quality of the structural annotation. A series of functional data types has been annotated for the rice genome that includes alignment with genetic markers, assignment of gene ontologies, identification of flanking sequence tags, alignment with homologs from related species, and syntenic mapping with other cereal species. All structural and functional annotation data are available through interactive search and display windows as well as through download of flat files. To integrate the data with other genome projects, the annotation data are available through a Distributed Annotation System and a Genome Browser. All data can be obtained through the project Web pages at http://rice.tigr.org.  相似文献   

15.
16.
17.
Non-redundant expressed sequence tags (ESTs) were generated from six different organs at various developmental stages of Chinese cabbage, Brassica rapa L. ssp. pekinensis. Of the 1,295 ESTs, 915 (71%) showed significantly high homology in nucleotide or deduced amino acid sequences with other sequences deposited in databases, while 380 did not show similarity to any sequences. Briefly, 598 ESTs matched with proteins of identified biological function, 177 with hypothetical proteins or non-annotated Arabidopsis genome sequences, and 140 with other ESTs. About 82% of the top-scored matching sequences were from Arabidopsis or Brassica, but overall 558 (43%) ESTs matched with Arabidopsis ESTs at the nucleotide sequence level. This observation strongly supports the idea that gene-expression profiles of Chinese cabbage differ from that of Arabidopsis, despite their genome structures being similar to each other. Moreover, sequence analyses of 21 Brassica ESTs revealed that their primary structure is different from those of corresponding annotated sequences of Arabidopsis genes. Our data suggest that direct prediction of Brassica gene expression pattern based on the information from Arabidopsis genome research has some limitations. Thus, information obtained from the Brassica EST study is useful not only for understanding of unique developmental processes of the plant, but also for the study of Arabidopsis genome structure.  相似文献   

18.
The first comprehensive comparison of gene content between higher plant species provided the unexpected conclusions that rice contained about twice as many genes as Arabidopsis, and that about half of the rice genes had no obvious homologs in any other organism. Our subsequent analyses indicate that most of these "extra, novel" rice genes are mis-annotated segments of transposable elements, especially retrotransposons. Aggressive annotation of a randomly selected subset of the rice genome suggests that the gene number is less than 40000. The five fantasies of automated plant gene discovery are described and a protocol is provided to minimize (or at least predict) the inaccuracy of future plant genome annotations.  相似文献   

19.
Comparative cross-species alternative splicing in plants   总被引:1,自引:0,他引:1       下载免费PDF全文
Alternative splicing (AS) can add significantly to genome complexity. Plants are thought to exhibit less AS than animals. An algorithm, based on expressed sequence tag (EST) pairs gapped alignment, was developed that takes advantage of the relatively small intron and exon size in plants and directly compares pairs of ESTs to search for AS. EST pairs gapped alignment was first evaluated in Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), and tomato (Solanum lycopersicum) for which annotated genome sequence is available and was shown to accurately predict splicing events. The method was then applied to 11 plant species that include 17 cultivars for which enough ESTs are available. The results show a large, 3.7-fold difference in AS rates between plant species with Arabidopsis and rice in the lower range and lettuce (Lactuca sativa) and sorghum (Sorghum bicolor) in the upper range. Hence, compared to higher animals, plants show a much greater degree of variety in their AS rates and in some plant species the rates of animal and plant AS are comparable although the distribution of AS types may differ. In eudicots but not monocots, a correlation between genome size and AS rates was detected, implying that in eudicots the mechanisms that lead to larger genomes are a driving force for the evolution of AS.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号