Similar articles
 Found 20 similar articles; search time 31 ms
1.
For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.

2.
3.
In an effort to expand the Gossypium hirsutum L. (cotton) expressed sequence tag (EST) database, ESTs representing a variety of tissues and treatments were sequenced. Assembly of these sequences with ESTs already in the EST database (dbEST, GenBank) identified 9675 cotton sequences not present in GenBank. Statistical analysis of a subset of these ESTs identified genes likely differentially expressed in stems, cotyledons, and drought-stressed tissues. Annotation of the differentially expressed cDNAs tentatively identified genes involved in lignin metabolism, starch biosynthesis and stress response, consistent with pathways likely to be active in the tissues under investigation. Simple sequence repeats (SSRs) were identified among these ESTs, and an inexpensive method was developed to screen genomic DNA for the presence of these SSRs. At least 69 SSRs potentially useful in mapping were identified. Selected amplified SSRs were isolated and sequenced. The sequences corresponded to the EST containing the SSRs, confirming that these SSRs will potentially map the gene represented by the EST. The ESTs containing SSRs were annotated to help identify the genes that may be mapped using these markers.
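The SSR identification step mentioned above can be illustrated with a short sketch. It is not the authors' method: the abstract gives no motif lengths or repeat thresholds, so the function name `find_ssrs`, the motif length of 2 to 3 bases, and the minimum of 5 repeats are all assumptions chosen for illustration.

```python
import re


def find_ssrs(seq: str, min_repeats: int = 5) -> list[tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for di- and trinucleotide repeats.

    Hypothetical helper: the thresholds are illustrative, not from the source.
    """
    # A motif of 2-3 bases followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(r"([ACGT]{2,3})\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        # Skip single-base runs such as poly-A tails, which are common in ESTs.
        if len(set(motif)) > 1:
            hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


# Toy EST with a (CT)6 repeat followed by an (AAG)5 repeat.
toy_est = "GG" + "CT" * 6 + "AAG" * 5 + "TT"
for start, motif, count in find_ssrs(toy_est):
    print(f"position {start}: ({motif}){count}")
```

A production screen would typically also handle longer motifs and imperfect repeats; this sketch covers only perfect di- and trinucleotide repeats.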

4.
For comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 22,983 5'-end expressed sequence tags (ESTs) were accumulated from normalized and size-selected cDNA libraries constructed from young (2-week-old) plants. The EST sequences were clustered into 7137 non-redundant groups. A similarity search against the public non-redundant protein database indicated that 3302 groups showed similarity to genes of known function, 1143 groups to hypothetical genes, and 2692 were novel sequences. Homologues of 5 nodule-specific genes that have been reported in other legume species were found among the collected ESTs, suggesting that the EST source generated in this study will become a useful tool for identification of genes related to legume-specific biological processes. The sequence data of individual ESTs are available at the web site: http://www.kazusa.or.jp/en/plant/lotus/EST/.

5.
Human bone marrow stromal cells (HBMSC) are pluripotent cells with the potential to differentiate into osteoblasts, chondrocytes, myelosupportive stroma, and marrow adipocytes. We used high-throughput DNA sequencing analysis to generate 4258 single-pass sequencing reactions (known as expressed sequence tags, or ESTs) obtained from the 5' (97) and 3' (4161) ends of human cDNA clones from a HBMSC cDNA library. Our goal was to obtain tag sequences from the maximum number of possible genes and to deposit them in the publicly accessible database for ESTs (dbEST of the National Center for Biotechnology Information). Comparisons of our EST sequencing data with nonredundant human mRNA and protein databases showed that the ESTs represent 1860 gene clusters. BLAST analysis against 3.0 million ESTs in NCBI's dbEST database showed that 60 novel genes were found only in this cDNA library. The BLAST search also identified ESTs with close homology to known genes, which suggests that these may be newly recognized members of known gene families. The gene expression profile of this cell type is revealed by analyzing both the frequency with which a message is encountered and the functional categorization of expressed sequences. Comparing an EST sequence with the human genomic sequence database enables assignment of an EST to a specific chromosomal region (a process called digital gene localization) and often enables immediate partial determination of intron/exon boundaries within the genomic structure. It is expected that high-throughput EST sequencing and data mining analysis will greatly promote our understanding of gene expression in these cells and of growth and development of the skeleton.

6.
A four-step procedure for the efficient and systematic mining of whole EST libraries for differentially expressed genes is presented. After eliminating redundant entries from the EST library under investigation (step 1), contigs of maximal length are built upon each remaining EST using about 4 000 000 public and proprietary ESTs (step 2). These putative genes are compared against a database comprising ESTs from 16 different tissues (both normal and tumour affected) to determine whether or not they are differentially expressed (step 3; electronic northern). Fisher's exact test is used to assess the significance of differential expression. In step 4, an attempt is made to characterise the contigs obtained in the assembly through database comparison. A case study of the CGAP library NCI_CGAP_Br1.1, a library made from three (well, moderately, and poorly differentiated) invasive ductal breast tumours (2126 ESTs in total) was carried out. Of the maximal contigs, 139 were found to be significantly (alpha = 0.05) over-expressed in breast tumour tissue, while 13 appeared to be down-regulated.
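The significance test in step 3 can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function name, the 2x2 table layout, and the example counts (12 tumour ESTs in a contig, and 3 of 5000 ESTs in a normal pool) are hypothetical. Only the tumour library size of 2126 ESTs and alpha = 0.05 come from the abstract.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows are EST pools (tumour, normal); columns are
    (ESTs in the contig, ESTs not in the contig).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical contig: 12 of 2126 tumour ESTs versus 3 of 5000 normal ESTs.
p_value = fisher_exact_two_sided(12, 2126 - 12, 3, 5000 - 3)
print(f"p = {p_value:.3g}; significant at alpha = 0.05: {p_value < 0.05}")
```

When many contigs are tested at once, as in the case study, a correction for multiple testing may also be worth considering; the abstract does not say whether one was applied.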

7.
8.
Maiti AK  Jorissen M  Bouvagnet P 《Genome biology》2001,2(7):research0026.1-research00269

Background

Immotile cilia syndrome (ICS) or primary ciliary dyskinesia (PCD) is an autosomal recessive disorder in humans in which the beating of cilia and sperm flagella is impaired. Ciliated epithelial cell linings are present in many tissues. To understand ciliary assembly and motility, it is important to isolate those genes involved in the process.

Results

Total RNA was isolated from cultured ciliated nasal epithelial cells after in vitro ciliogenesis and expressed sequence tags (ESTs) were generated. The functions and locations of 63 of these ESTs were derived by BLAST from two public databases. These ESTs are grouped into various classes. One group has high homology not only with the mitochondrial genome but also with one or more chromosomal DNAs, suggesting that very similar genes, or genes with very similar domains, are expressed from both mitochondrial and nuclear DNA. A second class comprises genes with complete homology with part of a known gene, suggesting that they are the same genes. A third group has partial homology with domains of known genes. A fourth group, constituting 33% of the ESTs characterized, has no significant homology with any gene or EST in the database.

Conclusions

We have shown that sufficient information about the location of ESTs could be derived electronically from the recently completed human genome sequences. This strategy of EST localization should be significantly useful for mapping and identification of new genes in the forthcoming human genome sequences with the vast number of ESTs in the dbEST database.

9.
Construction and application of a PC/Linux-based system for in silico elongation of nucleic acid sequences   Total citations: 5, self-citations: 0, citations by others: 5
Obtaining the full-length cDNA sequence of a novel gene is often a difficult problem for molecular biologists. The Human Genome Project and related projects have produced large numbers of expressed sequence tags (ESTs), and with suitable bioinformatics algorithms these ESTs can often be used to extend fragments of novel genes. Using the Linux operating system, the Blast and Phrap software, and an EST database, a system for in silico elongation of EST sequences was built on a personal computer. The system was applied to 11386 EST sequences and 511 cDNA sequences spanning the full length of their inserts, all from human fetal liver. Of these, 8373 EST sequences and 389 cDNA sequences were extended to varying degrees, and some of the results were confirmed by RACE experiments. The system can extend EST sequences efficiently and at scale, and can provide important leads for obtaining full-length cDNA sequences of novel genes experimentally.
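The idea of in silico elongation can be illustrated with a toy sketch. The system described above relies on Blast to find overlapping ESTs and Phrap to assemble them; the code below swaps in exact suffix-prefix matching so that it stays self-contained. The function name `extend_right`, the `min_overlap` threshold, and the toy sequences are all hypothetical.

```python
def extend_right(seed: str, ests: list[str], min_overlap: int = 6) -> str:
    """Greedily extend `seed` at its 3' end using exactly overlapping ESTs.

    Toy stand-in for a Blast search plus Phrap assembly: an EST is used only
    if its prefix matches the current contig's suffix exactly.
    """
    contig = seed
    unused = list(ests)
    while True:
        best = None  # (bases gained, EST, overlap length)
        for est in unused:
            # Longest overlap first; cap at len(est) - 1 so each use adds bases.
            for k in range(min(len(contig), len(est) - 1), min_overlap - 1, -1):
                if contig.endswith(est[:k]):
                    gain = len(est) - k
                    if best is None or gain > best[0]:
                        best = (gain, est, k)
                    break
        if best is None:
            return contig  # no remaining EST overlaps the contig end
        _, est, k = best
        contig += est[k:]
        unused.remove(est)


seed = "ATGCCGTTAGCA"
pool = ["TTAGCAGGCTAA", "GGCTAACCGT", "TTTTTTTTTT"]
print(extend_right(seed, pool))
```

Real EST data contain sequencing errors and chimeric clones, which is why alignment-based tools with quality-aware assembly are used in practice; exact matching here only shows the iterative search-and-extend loop.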

10.
The generation of large numbers of partial cDNA sequences, or expressed sequence tags (ESTs), has provided a method with which to sample a large number of genes from an organism. More than 25,000 Arabidopsis thaliana ESTs have been deposited in public databases, producing the largest collection of ESTs for any plant species. We describe here the application of a method of reducing redundancy and increasing information content in this collection by grouping overlapping ESTs representing the same gene into a "contig" or assembly. The increased information content of these assemblies allows more putative identifications to be assigned based on the results of similarity searches with nucleotide and protein databases. The results of this analysis indicate that sequence information is available for approximately 12,600 nonoverlapping ESTs from Arabidopsis. Comparison of the assemblies with 953 Arabidopsis coding sequences indicates that up to 57% of all Arabidopsis genes are represented by an EST. Clustering analysis of these sequences suggests that between 300 and 700 gene families are represented by between 700 and 2000 sequences in the EST database. A database of the assembled sequences, their putative identifications, and cellular roles is available through the World Wide Web.
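Grouping overlapping ESTs into non-redundant sets, as described above, can be sketched with a union-find structure. This is an illustration under simplifying assumptions, not the method used in the study: overlaps are detected by exact suffix-prefix matching or containment, and the names `overlap_len`, `cluster_ests`, and the `min_len` threshold are hypothetical.

```python
def overlap_len(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def cluster_ests(ests: list[str], min_len: int = 5) -> list[list[int]]:
    """Group EST indices so that overlapping sequences share a cluster."""
    parent = list(range(len(ests)))

    def find(i: int) -> int:
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(ests)):
        for j in range(i + 1, len(ests)):
            a, b = ests[i], ests[j]
            if (overlap_len(a, b, min_len) or overlap_len(b, a, min_len)
                    or a in b or b in a):
                parent[find(i)] = find(j)  # merge the two clusters

    groups: dict[int, list[int]] = {}
    for i in range(len(ests)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


toy = ["ATGGCGTACG", "GTACGTTAGC", "CCCCCAAAAA"]
print(cluster_ests(toy))  # the first two overlap by "GTACG"; the third stands alone
```

The pairwise comparison is quadratic in the number of ESTs, so a collection of tens of thousands of sequences would need an indexed similarity search instead; the union-find bookkeeping stays the same.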

11.
12.
Wu XL  Griffin KB  Garcia MD  Michal JJ  Xiao Q  Wright RW  Jiang Z 《Gene》2004,340(2):213-225
The launch of large-scale chicken expressed sequence tag (EST) projects has placed the chicken in the lead for the number of EST sequences among agriculturally important animals. More than 451,000 chicken ESTs derived from over 158 libraries had been deposited in the NCBI dbEST database as of December 2003. But how many genes these ESTs represent and how they are expressed in different chicken tissues/organs remain undetermined. In the present research, we developed a human gene-based strategy for census of chicken orthologous genes and identification of their expression patterns. Among 34,157 human coding genes used in the study, BLAST analysis revealed that 11,066 genes provisionally matched 248,628 chicken ESTs. Based on the average EST abundance of the orthologous genes, the current public repository of chicken ESTs could represent 20,000 provisional genes. Analysis of gene expression in 14 single tissues/organs showed that approximately 15% of genes were expressed exclusively in a single tissue/organ whereas the remaining 85% of genes were co-expressed in two or more tissues/organs. A majority (91.15%) of genes expressed in chicken embryos were also expressed at post-hatch stages, indicating that most genes activated in chicken embryos could serve housekeeping functions. Self-organizing maps (SOM) analysis organized 8807 provisional genes in selected chicken tissues into 98 clusters with each cluster being indicative of common regulatory factors and pathways. A total of 969 provisional orthologous genes were identified as preferentially expressed genes (PEGs) in various chicken tissues/organs (LOD>3.0). No doubt, the present study on gene expression patterns will provide insight into dynamics of metabolic pathways and tissue/organ programming and reprogramming in chickens.
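The estimate of about 20,000 provisional genes can be reproduced approximately from the figures in the abstract. The calculation below is one plausible reading of "based on the average EST abundance", not a confirmed description of the authors' procedure; it also treats "more than 451,000" as exactly 451,000, which is an assumption.

```python
# Figures quoted in the abstract above.
matched_genes = 11_066   # human genes with provisional chicken EST matches
matched_ests = 248_628   # chicken ESTs matched by those genes
total_ests = 451_000     # "more than 451,000", treated as exact (assumption)


def estimate_gene_count(total: int, ests: int, genes: int) -> float:
    """Scale the matched gene count up to the whole EST collection."""
    ests_per_gene = ests / genes
    return total / ests_per_gene


ests_per_gene = matched_ests / matched_genes
estimate = estimate_gene_count(total_ests, matched_ests, matched_genes)
print(f"{ests_per_gene:.1f} ESTs per matched gene")
print(f"about {estimate:,.0f} genes represented")
```

Under these assumptions the result lands close to the 20,000 genes reported, which suggests the published figure is a rounded version of a similar proportional estimate.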

13.
Expressed sequence tags (ESTs) from the marine red alga Gracilaria gracilis   Total citations: 2, self-citations: 0, citations by others: 2
Expressed sequence tags (ESTs) are partial sequences of cDNAs, and can be used to characterize gene expression in organisms or tissues. We have constructed a 200-sequence EST database from vegetative thalli of Gracilaria gracilis, the first ESTs reported from any alga. This database contains recognizable ESTs corresponding to genes of carbohydrate metabolism (seven), amino acid metabolism (three), photosynthesis (five), nucleic acid synthesis, repair and processing (three), protein synthesis (14), protein degradation (six), cellular maintenance and stress response (three), other identifiable protein-coding genes (13) and 146 sequences for which significant matches were not found in existing sequence databases. We have already used this EST database to recover genes of carbohydrate biosynthesis from G. gracilis. This revised version was published online in August 2006 with corrections to the Cover Date.

14.
Analysis of expressed sequence tags from oil palm (Elaeis guineensis)   Total citations: 3, self-citations: 0, citations by others: 3
This is the first report of a systematic study of genes expressed by means of expressed sequence tag (EST) analysis in oil palm, a species of the Arecales order, a phylogenetically key clade of monocotyledons that is not widely represented in the sequence databases. Five different cDNA libraries were generated from male and female inflorescences, shoot apices and zygotic embryos and unidirectional systematic sequencing was performed. A total of 2411 valid EST sequences were thus obtained. Cluster analysis enabled the identification of 209 groups of related sequences and 1874 singletons. Putative functions were assigned to 1252 of the set of 2083 non-redundant ESTs obtained. The EST database described here is a first step towards gene discovery and cDNA array-based expression analysis in oil palm.

15.
Expressed sequence tags (ESTs) represent 500-1000-bp-long sequences corresponding to mRNAs derived from different sources (cell lines, tissues, etc.). The human EST database contains over 8,000,000 sequences, with over 4,000,000,000 total nucleotides. RNA molecules are transcribed from a genomic DNA template; therefore, all ESTs should match corresponding genomes. Nevertheless, we have found in the human EST database approximately 11,000 ESTs not matching sequences in the human genome database. The presence of "trash" ESTs (TESTs) in the EST database could result from DNA or RNA contamination of the laboratory equipment, tissues, or cell lines. TESTs could also represent sequences from unidentified human genes or from species inhabiting the human body. Here, we attempt to identify the sources of human EST database contaminations. In particular, we discuss systematic contamination of the mammalian EST databases with sequences of plants.
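The screen for ESTs that fail to match the genome can be sketched as follows. The abstract does not describe the matching procedure, and a real analysis would use an alignment tool; the code below substitutes exact k-mer lookup on both strands so that it remains self-contained. The function names, the k-mer size, the 0.5 threshold, and the toy sequences are all hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> set[str]:
    """All distinct substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def flag_unmatched(ests: dict[str, str], genome: str,
                   k: int = 8, threshold: float = 0.5) -> list[str]:
    """Names of ESTs whose fraction of k-mers found in the genome is below threshold."""
    index = kmers(genome, k) | kmers(revcomp(genome), k)  # both strands
    flagged = []
    for name, seq in ests.items():
        seq = seq.upper()
        windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        hits = sum(1 for w in windows if w in index)
        fraction = hits / len(windows) if windows else 0.0
        if fraction < threshold:
            flagged.append(name)
    return flagged


genome = "ATGGCGTACGTTAGCCGATCGGATCCTAGGCTA"
toy_ests = {
    "forward": genome[5:25],            # copied from the genome
    "reverse": revcomp(genome[3:23]),   # copied from the opposite strand
    "foreign": "TTTTTTTTTTGGGGGGGGGG",  # no counterpart in the genome
}
print(flag_unmatched(toy_ests, genome))
```

Exact k-mer lookup would misclassify genuine transcripts that span splice junctions or carry sequencing errors, so a flagged EST is only a candidate for closer inspection.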

16.
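Screening of differentially expressed genes in transformed rat embryo fibroblast cell lines   Total citations: 4, self-citations: 5, citations by others: 4
Two cell lines derived from transformed rat embryo fibroblasts were compared: relative to B4 cells, A1-5 cells show very strong radioresistance accompanied by an unusually strong G2 delay. PCR-based suppression subtractive hybridization was applied to these two cell lines in the hope of finding one or more genes that play a key role in the unusual phenotype of A1-5 cells. A total of 160 subtracted transformants were obtained; each was sequenced and examined by dot blot hybridization, yielding 35 differentially expressed gene fragments (ESTs). BLAST homology searches against the non-redundant nucleotide database (NT), the rodent EST database, and the human EST database of the US National Center for Biotechnology Information (NCBI) showed that 21 of the fragments represent novel genes not yet deposited, while the other 14 are each highly homologous to known genes.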

17.
Brown AC  Kai K  May ME  Brown DC  Roopenian DC 《Genomics》2004,83(3):528-539

18.
Lotus japonicus has received increased attention as a potential model legume plant. In order to study gene expression in reproductive organs and to identify genes that play a crucial function in sexual reproduction, we constructed a cDNA library from immature flower buds containing anthers at the stage of developing tapetum cells in L. japonicus, and characterized 919 expressed sequence tags (ESTs) randomly selected from this library. The 919 ESTs analyzed were clustered into 821 non-redundant EST groups. As a result of a database search, 436 groups (53%) out of the 821 groups showed sequence similarity to genes registered in the public database. Out of these 436 groups, 109 groups showed similarity to genes encoding hypothetical proteins whose function had not yet been estimated. Three hundred eighty-five groups (47%) showed no significant homology to known sequences and were classified as novel sequences. A comparison of the 821 non-redundant EST sequences with EST sequences derived from the whole plant of L. japonicus revealed that 474 EST sequences derived from immature flower buds were not found among the EST sequences of the whole plant. In order to confirm the expression pattern of potential reproductive-organ-specific EST clones, nine clones that did not match ESTs derived from the whole plant were selected, and RT-PCR analysis was performed on these clones. As a result of RT-PCR, we found two novel anther-specific clones. One clone was homologous to a gene encoding a human cleft lip and palate associated transmembrane protein (CLPTM1)-like protein, and the other clone did not show a significant similarity to any genes deposited in the public database. These results indicate that the ESTs analyzed here represent a valuable resource for finding reproductive-organ-specific genes in Lotus japonicus.

19.
20.