Similar articles
 20 similar articles found (search time: 31 ms)
1.
Compared to rice, wheat exhibits characteristic growth habits and contains complex genome constituents. To assess global changes in gene expression patterns in the wheat life cycle, we conducted a large-scale analysis of expressed sequence tags (ESTs) in common wheat. Ten wheat tissues were used to construct cDNA libraries: crown and root from 14-day-old seedlings; spikelet from early and late flowering stages; spike at the booting stage, heading date and flowering date; pistil at the heading date; and seeds at 10 and 30 days post-anthesis. Several thousand colonies were randomly selected from each of these 10 cDNA libraries and sequenced from both 5' and 3' ends. Consequently, a total of 116 232 sequences were accumulated and classified into 25 971 contigs based on sequence homology. By computing abundantly expressed ESTs, correlated expression patterns of genes across the tissues were identified. Furthermore, relationships of gene expression profiles among the 10 wheat tissues were inferred from global gene expression patterns. Genes with similar functions were grouped with one another by clustering gene expression profiles. This technique might enable estimation of the functions of anonymous genes. Multidimensional analysis of EST data that is analogous to microarray experiments may offer new approaches to functional genomics of plants.

2.
Mining single-nucleotide polymorphisms from hexaploid wheat ESTs.   Cited by: 20 (self-citations: 0, citations by others: 20)
Single-nucleotide polymorphisms (SNPs) represent a new form of functional marker, particularly when they are derived from expressed sequence tags (ESTs). A bioinformatics strategy was developed to discover SNPs within a large wheat EST database and to demonstrate the utility of SNPs in genetic mapping and genetic diversity applications. A collection of > 90000 wheat ESTs was assembled into contiguous sequences (contigs), and 45 random contigs were then visually inspected to identify primer pairs capable of amplifying specific alleles. We estimate that homoeologue sequence variants occurred 1 in 24 bp and the frequency of SNPs between wheat genotypes was 1 SNP/540 bp (theta = 0.0069). Furthermore, we estimate that one diagnostic SNP test can be developed from every contig with 10-60 EST members. Thus, EST databases are an abundant source of SNP markers. Polymorphism information content for SNPs ranged from 0.04 to 0.50 and ESTs could be mapped into a framework of microsatellite markers using segregating populations. The results showed that SNPs in wheat can be discovered in ESTs, validated, and applied to conventional genetic studies.

3.
Gibberella zeae is a broad host range pathogen that infects many crop plants, including wheat and barley, and causes head blight and rot diseases throughout the world. To better understand fungal development and pathogenicity, we have generated 7996 ESTs from three cDNA libraries. Two libraries were generated from carbon- (C-) and nitrogen- (N-) starved mycelia and one library was generated from cultures of maturing perithecia (P). In other fungal pathogens, starvation conditions have been shown to act as cues to induce infection-related gene expression. To assign putative function to cDNAs, sequences were initially assembled using StackPack. The estimated total number of genes identified from the three EST databases was 2110: 1088 contigs and 1022 singleton sequences. These 2110 sequences were compared to a yeast protein sequence reference set and to the GenBank nonredundant database using BLASTX. Based on presumptive gene function identified by this process, we found that the two starved cultures had similar, but not identical, patterns of gene expression, whereas the developmental cultures were distinct in their pattern of expression. Of the three libraries, the perithecium library had the greatest percentage (46%) of ESTs falling into the "unclassified" category. Homologues of some known fungal virulence or pathogenicity factors were found primarily in the N- and C-libraries. Comparisons also were made with ESTs from the related fungi Neurospora crassa and Magnaporthe grisea, and with the genomic sequence of N. crassa.

4.
5.
Expressed Sequence Tag (EST) analysis has pioneered genome-wide gene discovery and expression profiling. In order to establish a gene expression index in the rice cultivar indica, we sequenced and analyzed 86,136 ESTs from nine rice cDNA libraries from the super hybrid cultivar LYP9 and its parental cultivars. We assembled these ESTs into 13,232 contigs and left 8,976 singletons. Overall, 7,497 sequences were found to be similar to existing sequences in GenBank and 14,711 were novel. These sequences were classified by molecular function, biological process and pathways according to the Gene Ontology. We compared our sequenced ESTs with the publicly available 95,000 ESTs from japonica, and found little sequence variation, despite the large difference between genome sequences. We then assembled the combined 173,000 rice ESTs for further analysis. Using the pooled ESTs, we compared gene expression in metabolic pathways between rice and Arabidopsis according to KEGG. We further profiled gene expression pattern

6.
7.
8.
Brown AC  Kai K  May ME  Brown DC  Roopenian DC 《Genomics》2004,83(3):528-539

9.
10.
The public EST (expressed sequence tag) databases represent an enormous but heterogeneous repository of sequences, including many from a broad selection of plant species and a wide range of distinct varieties. The significant redundancy within large EST collections makes them an attractive resource for rapid pre-selection of candidate sequence polymorphisms. Here we present a strategy that allows rapid identification of candidate SNPs in barley (Hordeum vulgare L.) using publicly available EST databases. Analysis of 271,630 EST sequences from different cDNA libraries, representing 23 different barley varieties, resulted in the generation of 56,302 tentative consensus sequences. In all, 8171 of these unigene sequences are members of clusters with six or more ESTs. By applying a novel SNP detection algorithm (SNiPpER) to these sequences, we identified 3069 candidate inter-varietal SNPs. In order to verify these candidate SNPs, we selected a small subset of 63 present in 36 ESTs. Of the 63 SNPs selected, we were able to validate 54 (86%) using a direct sequencing approach. For further verification, 28 ESTs were mapped to distinct loci within the barley genome. The polymorphism information content (PIC) and nucleotide diversity (π) values of the SNPs identified by the SNiPpER algorithm are significantly higher than those that were obtained by random sequencing. This demonstrates the efficiency of our strategy for SNP identification and the cost-efficient development of EST-based SNP markers. The first two authors contributed equally to this work.

11.
12.
PEDB: the Prostate Expression Database.   Cited by: 6 (self-citations: 1, citations by others: 5)
The Prostate Expression Database (PEDB) is a curated relational database and suite of analysis tools designed for the study of prostate gene expression in normal and disease states. Expressed Sequence Tags (ESTs) and full-length cDNA sequences derived from more than 40 human prostate cDNA libraries are maintained and represent a wide spectrum of normal and pathological conditions. Detailed library information including tissue source, library construction methods, sequence diversity and abundance is available in a library archive. Prostate ESTs are assembled into distinct species groups using the multiple alignment program CAP2 and are annotated with information from the GenBank, dbEST and Unigene public sequence databases. Annotated sequences in PEDB are searched using the BLAST algorithm. The differential expression of each EST species can be viewed across all libraries using a Virtual Expression Analysis Tool (VEAT), a graphical user interface written in Java for intra- and inter-library species comparisons. PEDB may be accessed via the World Wide Web at http://www.mbt.washington.edu/PEDB/

13.
Linkage mapping of gene-associated SNPs to pig chromosome 11   Cited by: 3 (self-citations: 0, citations by others: 3)
Single nucleotide polymorphisms (SNPs) were discovered in porcine expressed sequence tags (ESTs) orthologous to genes from human chromosome 13 (HSA13) and predicted to be located on pig chromosome 11 (SSC11). The SNPs were identified as sequence variants in clusters of EST sequences from pig cDNA libraries constructed in the Sino-Danish pig genome project. In total, 312 human gene sequences from HSA13 were used for similarity searches in our pig EST database. Pig ESTs showing significant similarity with HSA13 genes were clustered and candidate SNPs were identified. Allele frequencies for 26 SNPs were estimated in a group of 80 unrelated pigs from Danish commercial pig breeds: Duroc, Hampshire, Landrace and Large White. Eighteen of the 26 SNPs genotyped in the PiGMaP Reference Families were mapped by linkage analysis to SSC11. The EST-based SNPs published here are new genetic markers useful for linkage and association studies in commercial and experimental pig populations. This study represents the first gene-associated SNP linkage map of pig chromosome 11 and adds new comparative mapping information between SSC11 and HSA13. Furthermore, our data facilitate future studies aimed at the identification of interesting regions on pig chromosome 11, positional cloning and fine mapping of quantitative trait loci in pig.

14.
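Cloning and sequence analysis of the wheat uroporphyrinogen III synthase gene   Cited by: 2 (self-citations: 0, citations by others: 2)
Based on conserved sequences of the published rice uroporphyrinogen III synthase (UROS) gene and of wheat ESTs, specific primers were designed to clone a partial fragment of the wheat uroporphyrinogen III synthase gene, yielding a 364 bp cDNA (designated UROS1). Using UROS1 as the seed sequence for in silico cloning, a 1210 bp cDNA sequence was obtained, and specific primers were then designed to clone a 1077 bp cDNA sequence. Analysis of this fragment showed that the cloned wheat UROS gene contains the signal peptide region and the full-length mature peptide region. The wheat UROS gene shares about 86% homology with the rice UROS gene, and its deduced amino acid sequence shares about 91% and 79% homology with the rice and Arabidopsis protein sequences, respectively. Nucleotide sequence conservation among animals, plants and microorganisms is low, and amino acid sequence conservation is also not high, but all contain the conserved UROS domain (Hem D). Evolutionary analysis showed that the rate of evolution of this enzyme differs considerably among species.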

15.
Public and private EST (Expressed Sequence Tag) programs provide access to a large number of ESTs from a number of plant species, including Arabidopsis, corn, soybean, rice and wheat. In addition to the homology of each EST to genes in GenBank, information about homology to all other ESTs in the database can be obtained. To estimate expression levels of genes represented in the DuPont EST database, we count the number of times each gene has been seen in different cDNA libraries, from different tissues, developmental stages or induction conditions. This quantitation of message levels is quite accurate for highly expressed messages and, unlike conventional Northern blots, allows comparison of expression levels between different genes. Lists of the most highly expressed genes in different libraries can be compiled. Also, if EST data are available for cDNA libraries derived from different developmental stages, gene expression profiles across development can be assembled. We present an example of such a profile for soybean seed development. Gene expression data obtained from Electronic Northern analysis can be confirmed and extended beyond the realm of highly expressed genes by using high density DNA arrays. The ESTs identified as interesting can be arrayed on nylon or glass and probed with total labeled first-strand cDNA from the tissue of interest. Two-color fluorescent labeling allows accurate mRNA ratio measurements. We are currently using the DNA array technology to study chemical induction of gene expression and the biosynthesis of oil, carbohydrate and protein in developing seeds.
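The counting approach in this abstract (an "Electronic Northern") can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `electronic_northern`, the input layout (one `(library, gene)` pair per EST) and the normalization to counts per 10,000 ESTs are all assumptions made for illustration; the abstract only states that occurrences of each gene are counted per cDNA library.

```python
from collections import Counter


def electronic_northern(est_assignments, scale=10_000):
    """Tally ESTs per gene in each library and normalize by library size.

    est_assignments: iterable of (library, gene) pairs, one pair per EST.
    scale: hypothetical normalization constant (ESTs per `scale` sequenced).
    Returns {gene: {library: normalized_count}}.
    """
    library_totals = Counter()   # total ESTs sequenced per library
    gene_counts = Counter()      # ESTs per (library, gene)
    for library, gene in est_assignments:
        library_totals[library] += 1
        gene_counts[(library, gene)] += 1

    profile = {}
    for (library, gene), n in gene_counts.items():
        # Normalizing makes libraries of different depth comparable.
        profile.setdefault(gene, {})[library] = n * scale / library_totals[library]
    return profile


# Toy data: two made-up libraries and two made-up genes.
ests = (
    [("seed_early", "geneA")] * 3
    + [("seed_early", "geneB")]
    + [("seed_late", "geneA")]
    + [("seed_late", "geneB")] * 3
)
profile = electronic_northern(ests)
```

With the toy data, `profile["geneA"]` falls from 7500.0 in `seed_early` to 2500.0 in `seed_late`, which is the kind of developmental profile the abstract describes. As the abstract notes, such counts are only reliable for highly expressed messages.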

16.
Discovery of single nucleotide polymorphisms (SNPs) requires analysis of redundant sequences such as those available in large public databases. The ability to detect SNPs, especially those of low frequency, is dependent on the depth and scale of the discovery effort. Large numbers of SNPs have been identified by mining large-scale EST surveys and whole genome sequencing projects. These surveys however are subject to ascertainment bias and the inherent errors in large-scale single pass sequencing efforts. For example, the number of steps involved in the construction and sequencing of cDNA libraries make ESTs highly error prone, resulting in an increased frequency of nonvalid SNPs obtained in these surveys. Sequences of mtDNA genes are often incorporated into cDNA libraries as an artifact of the library construction process and are typically either subtracted from cDNA libraries or are considered superfluous when evaluating the information content of EST datasets. Sequences of mtDNA genes provide a unique resource for the analysis of SNP parameters in EST projects. This study uses sequences from four turkey muscle cDNA libraries to demonstrate how mtDNA sequences gleaned from collections of ESTs can be used to estimate SNP parameters and thus help predict the validity of SNPs.

18.
Using a strategy requiring only modest computational resources, wheat expressed sequence tag (EST) sequences from various sources were assembled into contigs and compared with a nonredundant barley sequence assembly, with ESTs, with complete draft genome sequences of rice and Arabidopsis thaliana, and with ESTs from other plant species. These comparisons indicate that (i) wheat sequences available from public sources represent a substantial proportion of the diversity of wheat coding sequences, (ii) prediction of open reading frames in the whole genome sequence improves when supplemented with EST information from other species, (iii) a substantial number of candidates for novel genes that are unique to wheat or related species can be identified, and (iv) a smaller number of genes can be identified that are common to monocots and dicots but absent from Arabidopsis. The sequences in the last group may have been lost from Arabidopsis after descent from a common ancestor. Examples of potential novel wheat genes and Triticeae-specific genes are presented.

19.
Lin W  Yang HH  Lee MP 《Genomics》2005,86(5):518-527
Differential expression between the two alleles of an individual and between people with different genotypes has been commonly observed. Quantitative differences in gene expression between people may provide the genetic basis for the phenotypic difference between individuals and may be the primary cause of complex diseases. In this paper, we developed a computational method to identify genes that displayed allelic variation in gene expression in human EST libraries. To model allele-specific gene expression, we first identified EST libraries in which both A and B alleles were expressed and then identified allelic variation in gene expression based on the EST counts for each allele using a binomial test. Among 1107 SNPs that had a sufficient number of ESTs for the analysis, 524 (47%) displayed allelic variation in at least one cDNA library. We verified experimentally the allelic variation in gene expression for 6 of these SNPs. The frequency of allelic variation observed in EST libraries was similar to the previous studies using the SNP chip and primer extension method. We found that genes that displayed allelic variation were distributed throughout the human genome and were enriched in certain chromosome regions. The SNPs and genes identified in this study will provide a rich source for evaluating the effects of those SNPs and associated haplotypes in human health and diseases.
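The binomial test on per-allele EST counts mentioned in this abstract can be sketched in Python as follows. This is an illustrative sketch only: the function name `allelic_imbalance_p`, the null hypothesis of equal expression (p = 0.5) and the two-sided definition of the p-value are assumptions, since the abstract does not specify them.

```python
from math import comb


def allelic_imbalance_p(count_a, count_b, p_null=0.5):
    """Exact two-sided binomial test on EST counts for alleles A and B.

    Assumed null hypothesis: each EST comes from allele A with
    probability p_null (0.5 means equal expression of both alleles).
    The p-value sums the probabilities of all outcomes that are no
    more likely than the observed one.
    """
    n = count_a + count_b
    if n == 0:
        raise ValueError("at least one EST is required")
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[count_a]
    # The small tolerance guards against floating-point ties.
    p_value = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, p_value)


balanced = allelic_imbalance_p(5, 5)    # equal counts, no evidence of imbalance
skewed = allelic_imbalance_p(10, 0)     # all ESTs from one allele
```

With ten ESTs, a 5:5 split gives a p-value of 1, whereas a 10:0 split gives about 0.002, so only the second library would be flagged under a conventional 0.05 threshold. The abstract's requirement of "a sufficient number of ESTs" reflects the same point: with very few ESTs per SNP, even a completely one-sided split cannot reach significance.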

20.
Analysis of expressed sequence tags from oil palm (Elaeis guineensis)   Cited by: 3 (self-citations: 0, citations by others: 3)
This is the first report of a systematic study of genes expressed by means of expressed sequence tag (EST) analysis in oil palm, a species of the Arecales order, a phylogenetically key clade of monocotyledons that is not widely represented in the sequence databases. Five different cDNA libraries were generated from male and female inflorescences, shoot apices and zygotic embryos and unidirectional systematic sequencing was performed. A total of 2411 valid EST sequences were thus obtained. Cluster analysis enabled the identification of 209 groups of related sequences and 1874 singletons. Putative functions were assigned to 1252 of the set of 2083 non-redundant ESTs obtained. The EST database described here is a first step towards gene discovery and cDNA array-based expression analysis in oil palm.
