Similar literature
Found 20 similar documents; search took 31 ms
1.
Human bone marrow stromal cells (HBMSC) are pluripotent cells with the potential to differentiate into osteoblasts, chondrocytes, myelosupportive stroma, and marrow adipocytes. We used high-throughput DNA sequencing analysis to generate 4258 single-pass sequencing reactions (known as expressed sequence tags, or ESTs) obtained from the 5' (97) and 3' (4161) ends of human cDNA clones from an HBMSC cDNA library. Our goal was to obtain tag sequences from the maximum number of possible genes and to deposit them in the publicly accessible database for ESTs (dbEST of the National Center for Biotechnology Information). Comparisons of our EST sequencing data with nonredundant human mRNA and protein databases showed that the ESTs represent 1860 gene clusters. The EST sequencing data analysis showed 60 novel genes found only in this cDNA library after BLAST analysis against 3.0 million ESTs in NCBI's dbEST database. The BLAST search also identified ESTs with close homology to known genes, which suggests that these may be newly recognized members of known gene families. The gene expression profile of this cell type is revealed by analyzing both the frequency with which a message is encountered and the functional categorization of expressed sequences. Comparing an EST sequence with the human genomic sequence database enables assignment of an EST to a specific chromosomal region (a process called digital gene localization) and often enables immediate partial determination of intron/exon boundaries within the genomic structure. It is expected that high-throughput EST sequencing and data mining analysis will greatly promote our understanding of gene expression in these cells and of growth and development of the skeleton.

2.
A compilation of soybean ESTs: generation and analysis.   Total citations: 18 (self-citations: 0; citations by others: 18)
Whole-genome sequencing is fundamental to understanding the genetic composition of an organism. Given the size and complexity of the soybean genome, an alternative approach is targeted random-gene sequencing, which provides an immediate and productive method of gene discovery. In this study, more than 120000 soybean expressed sequence tags (ESTs) generated from more than 50 cDNA libraries were evaluated. These ESTs coalesced into 16928 contigs and 17336 singletons. On average, each contig was composed of 6 ESTs and spanned 788 bases. The average sequence length submitted to dbEST was 414 bases. Using only those libraries generating more than 800 ESTs each and only those contigs with 10 or more ESTs each, correlated patterns of gene expression among libraries and genes were discerned. Two-dimensional qualitative representations of contig and library similarities were generated based on expression profiles. Genes with similar expression patterns and, potentially, similar functions were identified. These studies provide a rich source of publicly available gene sequences as well as valuable insight into the structure, function, and evolution of a model crop legume genome.

3.
Zea mays DataBase (ZmDB) seeks to provide a comprehensive view of maize (corn) genetics by linking genomic sequence data with gene expression analysis and phenotypes of mutant plants. ZmDB originated in 1999 as the Web portal for a large project of maize gene discovery, sequencing and phenotypic analysis using a transposon tagging strategy and expressed sequence tag (EST) sequencing. Recently, ZmDB has broadened its scope to include all public maize ESTs, genome survey sequences (GSSs), and protein sequences. More than 170 000 ESTs are currently clustered into approximately 20 000 contigs and about an equal number of apparent singlets. These clusters are continuously updated and annotated with respect to potential encoded protein products. More than 100 000 GSSs are similarly assembled and annotated by spliced alignment with EST and protein sequences. The ZmDB interface provides quick access to analytical tools for further sequence analysis. Every sequence record is linked to several display options and similarity search tools, including services for multiple sequence alignment, protein domain determination and spliced alignment. Furthermore, ZmDB provides web-based ordering of materials generated in the project, including ESTs, ordered collections of genomic sequences tagged with the RescueMu transposon and microarrays of amplified ESTs. ZmDB can be accessed at http://zmdb.iastate.edu/.

4.
5.
A new EST clustering method   Total citations: 11 (self-citations: 0; citations by others: 11)
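This study developed an EST (expressed sequence tag) clustering method, ESTClustering, for analyzing the large volume of data produced by large-scale EST sequencing in order to obtain high-quality, non-redundant expressed sequences. During clustering, the method uses the MEGABLAST tool to compare consensus sequences for sequence homology and uses the phrap program to verify the assembly of each EST cluster. This clustering strategy reduces the impact of sequencing errors, effectively identifies members of gene families, and avoids interference from alternative splicing. Compared with the UniGene clustering method of NCBI (National Center for Biotechnology Information), the clusters produced by ESTClustering better reflect the diversity of expressed sequences. In a test clustering of 112256 Arabidopsis ESTs, ESTClustering produced 23581 EST clusters, of which 13597 have corresponding coding sequences in the Arabidopsis genome, close to the number of predicted genes in that genome that are supported by EST evidence. Applying the method to a collected set of 147191 rice EST sequences yielded 33896 EST clusters.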
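The general idea of grouping ESTs into clusters by pairwise sequence similarity can be illustrated with a minimal Python sketch. This is not the ESTClustering method itself: the abstract relies on MEGABLAST comparisons and phrap assembly checks, whereas the sketch below substitutes a simple shared k-mer fraction as a stand-in similarity measure and merges linked sequences with a union-find structure. The function names, the thresholds `k` and `min_shared`, and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def kmers(seq, k=8):
    """Return the set of k-mers found in a sequence (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_ests(ests, k=8, min_shared=0.5):
    """Single-linkage clustering of ESTs (hypothetical stand-in similarity).

    Two ESTs are linked when the fraction of shared k-mers, relative to the
    smaller of the two k-mer sets, is at least min_shared.
    """
    ids = list(ests)
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    profiles = {i: kmers(ests[i], k) for i in ids}
    for a, b in combinations(ids, 2):
        pa, pb = profiles[a], profiles[b]
        smaller = min(len(pa), len(pb))
        if smaller and len(pa & pb) / smaller >= min_shared:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(members) for members in clusters.values())


# Toy data, invented for illustration: est2 overlaps the 3' part of est1,
# est3 is unrelated.
base = "ATGGCGTACGTTAGCCTAGGATCCGATTACGGA"
toy_ests = {
    "est1": base,
    "est2": base[10:] + "CCAT",
    "est3": "GGGTTTAAACCCGGGTTTAAACCCGGG",
}
clusters = cluster_ests(toy_ests)
print(clusters)
```

An all-against-all comparison like this scales quadratically, which is one reason real pipelines use indexed search tools for the comparison step; the union-find merge step is independent of how similarity is measured.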

6.

Background  

EST sequencing is a versatile approach for rapidly gathering protein coding sequences. ESTs provide direct access to an organism's gene repertoire, bypassing the still error-prone procedure of gene prediction from genomic data. Therefore, ESTs are often the only source for biological sequence data from taxa outside mainstream interest. The widespread use of ESTs in evolutionary studies and particularly in molecular systematics studies is still hindered by the lack of efficient and reliable approaches for automated ortholog predictions in ESTs. Existing methods either depend on a known species tree or cannot cope with redundancy in EST data.

7.
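In the genomes of higher plants, most sequence is non-expressed and genes make up only a small fraction, so understanding how genes are distributed across the genome is an important aspect of studying genome structure. With funding from the U.S. Department of Energy, the genome of a Populus trichocarpa clone has been sequenced and publicly released. The completed poplar whole-genome sequence offers a specific case for understanding gene distribution in forest tree genomes. In this paper, we used Poisson analysis to examine gene density on each chromosome of the poplar genome; the results show that gene content differs significantly among chromosomes. The poplar genome sequencing project revealed that the modern poplar genome originated from an ancient whole-genome duplication event (termed the salicoid genome duplication), so large homologous duplicated segments exist between different chromosomes. However, our study shows that for most highly homologous chromosomes, gene density bears no clear relationship to the homology between chromosomes, indicating that after the salicoid whole-genome duplication, genes were lost from the highly homologous chromosomes, and at different rates. We also aligned nearly 90,000 P. trichocarpa EST sequences and found that the genes they cover account for only about 16.8% of all genes in the poplar genome. Although EST sequencing is an important means of gene discovery, small-scale EST sequencing covers few genes, so its practical value is limited.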
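The abstract does not say exactly how the Poisson analysis was carried out, so the following Python sketch shows only one plausible reading: under a null model of uniform gene density, the expected gene count on a chromosome is proportional to its length, and the observed count is compared with that expectation. The chromosome names, lengths, and gene counts are invented for illustration and are not poplar data; the p-value uses a normal approximation to the Poisson distribution, which is reasonable only for large expected counts.

```python
import math


def poisson_density_test(lengths_mb, gene_counts):
    """Compare observed gene counts with a uniform-density expectation.

    Returns {chromosome: (expected_count, z_score, approx_two_sided_p)}.
    The p-value is a normal approximation to the Poisson tail.
    """
    total_len = sum(lengths_mb.values())
    total_genes = sum(gene_counts.values())
    results = {}
    for chrom, length in lengths_mb.items():
        # Expected count if genes were spread evenly along the genome.
        expected = total_genes * length / total_len
        # A Poisson variable has variance equal to its mean.
        z = (gene_counts[chrom] - expected) / math.sqrt(expected)
        p = math.erfc(abs(z) / math.sqrt(2))
        results[chrom] = (expected, z, p)
    return results


# Hypothetical chromosomes (lengths in Mb) and gene counts.
lengths = {"chrA": 30.0, "chrB": 20.0, "chrC": 10.0}
counts = {"chrA": 3300, "chrB": 1800, "chrC": 900}
results = poisson_density_test(lengths, counts)
for chrom, (expected, z, p) in results.items():
    print(f"{chrom}: expected={expected:.0f} z={z:+.2f} p~{p:.2g}")
```

In this toy example chrA is gene-rich and the other two are gene-poor relative to the uniform expectation; with many chromosomes, a multiple-testing correction would also be needed.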

8.
9.
10.
Expressed Sequence Tags (ESTs) are short, usually unedited sequences obtained by single-pass sequencing of cDNA clones from any cDNA library. Analyzing and comparing ESTs can provide information on gene expression, function and evolution. Large-scale EST sequencing has become an attractive alternative to plant genome sequencing. Currently, plant EST collections comprise over 3.8 million sequences from about 200 species. They have proved to be a valuable tool for gene discovery and plant metabolism analysis. Several plant-specific EST databases have been created which provide access to sequence data and bioinformatics-based tools for data mining. Searching EST collections allows pre-selection of genes for preparing cDNA arrays targeted to yield maximum information on specialized processes such as stress response and symbiotic nitrogen fixation. Also, EST-based molecular markers such as SNPs, SSRs, and indels are fast-developing tools for breeders and researchers.

11.
MOTIVATION: Large-scale gene discovery projects depend on highly accurate data. The data should not only be trustworthy but also be correctly annotated for the various features they contain. Sequence errors are inherent in single-pass sequences such as ESTs obtained from automated sequencing. These errors further complicate the automated identification of EST-related sequencing. A tool is required to prepare the data prior to advanced annotation processing and submission to public databases. RESULTS: This paper describes ESTprep, a program designed to preprocess expressed sequence tag (EST) sequences. It identifies the location of features present in ESTs and allows the sequence to pass only if it meets various quality criteria. Use of ESTprep has resulted in substantial improvement in accurate EST feature identification and fidelity of results submitted to GenBank. AVAILABILITY: The program is freely available for download from http://genome.uiowa.edu/pubsoft/software.html
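The kind of preprocessing described above (locate features, then pass or reject a read on quality criteria) can be sketched in Python as follows. This is not ESTprep's algorithm or interface: the abstract does not list its features or thresholds, so the single feature used here (a 3' poly(A) tail), the function name `preprocess_est`, and every threshold are assumptions chosen only to make the idea concrete.

```python
import re


def preprocess_est(seq, min_insert_len=100, max_n_fraction=0.05, min_polya=10):
    """Locate a 3' poly(A) tail and apply simple pass/fail quality checks.

    All thresholds are illustrative assumptions, not ESTprep defaults.
    """
    seq = seq.upper()
    # A run of at least min_polya A's anchored at the end of the read.
    tail = re.search(r"A{%d,}$" % min_polya, seq)
    polya_start = tail.start() if tail else None
    insert = seq[:polya_start] if tail else seq
    # Fraction of ambiguous base calls in the part that precedes the tail.
    n_fraction = insert.count("N") / len(insert) if insert else 1.0
    passed = len(insert) >= min_insert_len and n_fraction <= max_n_fraction
    return {
        "polya_start": polya_start,
        "insert_length": len(insert),
        "n_fraction": n_fraction,
        "passed": passed,
    }


# Synthetic reads, invented for illustration.
good = preprocess_est("ACGT" * 30 + "A" * 15)    # long insert with a tail
short = preprocess_est("ACGT" * 10 + "A" * 15)   # insert too short
noisy = preprocess_est("N" * 20 + "ACGT" * 30)   # too many ambiguous bases
print(good["passed"], short["passed"], noisy["passed"])
```

Returning the feature locations together with the pass/fail flag keeps the filtering decision auditable, which matters when rejected reads need to be inspected before database submission.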

12.
13.

Background  

ESTs are a tremendous resource for determining the exon-intron structures of genes, but even extensive EST sequencing tends to leave many exons and genes untouched. Gene prediction systems based exclusively on EST alignments miss these exons and genes, leading to poor sensitivity. De novo gene prediction systems, which ignore ESTs in favor of genomic sequence, can predict such "untouched" exons, but they are less accurate when predicting exons to which ESTs align. TWINSCAN is the most accurate de novo gene finder available for nematodes and N-SCAN is the most accurate for mammals, as measured by exact CDS gene prediction and exact exon prediction.

14.
Because of their crucial phylogenetic positions, hagfishes, sharks, and bichirs are recognized as key taxa in our understanding of vertebrate evolution. The expression patterns of the regulatory genes involved in developmental patterning have been analyzed in the context of evolutionary developmental studies. However, in a survey of public sequence databases, we found that the large-scale sequence data for these taxa are still limited. To address this deficit, we used conventional Sanger DNA sequencing and a next-generation sequencing technology based on 454 GS FLX sequencing to obtain expressed sequence tags (ESTs) of the Japanese inshore hagfish (Eptatretus burgeri; 161,482 ESTs), cloudy catshark (Scyliorhinus torazame; 165,819 ESTs), and gray bichir (Polypterus senegalus; 34,336 ESTs). We deposited the ESTs in a newly constructed database, designated the "Vertebrate TimeCapsule." The ESTs include sequences from genes that can be effectively used in evolutionary developmental studies; for instance, several encode cartilaginous extracellular matrix proteins, which are central to an understanding of the ways in which evolutionary processes affected the skeletal elements, whereas others encode regulatory genes involved in craniofacial development and early embryogenesis. Here, we discuss how hagfishes, sharks, and bichirs contribute to our understanding of vertebrate evolution, we review the current status of the publicly available sequence data for these three taxa, and we introduce our EST projects and newly developed database.

15.
16.
17.
The sequencing of expressed sequence tags (ESTs) from Xenopus laevis has lagged behind efforts on many other common experimental organisms and man, partly because of the pseudotetraploid nature of the Xenopus genome. Nonetheless, large collections of Xenopus ESTs would be useful in gene discovery, oligonucleotide-based knockout studies, gene chip analyses of normal and perturbed development, mapping studies in the related diploid frog X. tropicalis, and for other reasons. We have created a normalized library of cDNAs from unfertilized Xenopus eggs. These cells contain all of the information necessary for the first several cell divisions in the early embryo, as well as much of the information needed for embryonic pattern formation and cell fate determination. To date, we have successfully sequenced 13,879 ESTs out of 16,607 attempts (83.6% success rate), with an average sequence read length of 508 bp. Using a fragment assembly program, these ESTs were assembled into 8,985 'contigs' composed of up to 11 ESTs each. When these contigs were used to search publicly available databases, 46.2% bore no relationship to protein or DNA sequences in the database at the significance level of 1e-6. Examination of a sample of 100 of the assembled contigs revealed that most (approximately 87%) were composed of two apparent allelic variants. Expression profiles of 16 of the most prominent contigs showed that 12 exhibited some degree of zygotic expression. These findings have implications for sequence-specific applications for Xenopus ESTs, particularly the use of allele-specific oligonucleotides for knockout studies, differential hybridization techniques such as gene chip analysis, and the establishment of accurate nomenclature and databases for this species.

18.
19.
To isolate useful and interesting plant genes in large quantities, cDNA clones from a library prepared from ethylene-treated potato leaf were randomly sequenced. Partial sequences of 210 randomly selected clones, each with an insert longer than 500 base pairs (bp) and a poly(A) tail, were compared with sequences in the GenBank, EMBL and DDBJ nucleic acid databases and yielded 193 expressed sequence tags (ESTs). The 210 cDNA clones identified are related to various aspects of metabolic pathways such as glycolysis, amino acid synthesis, translation mechanism, ribosome synthesis, hormone response, stress response, regulation of gene expression, and signal transduction. Among the 193 ESTs, 12 ESTs (29 cDNA clones) appeared more than once, and 181 ESTs appeared only once and were regarded as a solitary group. Of the 210 clones, 29 (13.8%) have no similarity to known nucleotide sequences and could serve as a potentially useful resource for plant molecular biology with respect to particular genes. Nucleotide sequencing to generate more ESTs from ethylene-induced as well as non-induced potato leaf is in progress.

20.
The molecular ecologist's guide to expressed sequence tags   Total citations: 12 (self-citations: 0; citations by others: 12)
Genomics and bioinformatics have great potential to help address numerous topics in ecology and evolution. Expressed sequence tags (ESTs) can bridge genomics and molecular ecology because they can provide a means of accessing the gene space of almost any organism. We review how ESTs have been used in molecular ecology research in the last several years by providing sequence data for the design of molecular markers, genome-wide studies of gene expression and selection, the identification of candidate genes underlying adaptation, and the basis for studies of gene family and genome evolution. Given the tremendous recent advances in inexpensive sequencing technologies, we predict that molecular ecologists will increasingly be developing and using EST collections in the years to come. With this in mind, we close our review by discussing aspects of EST resource development of particular relevance for molecular ecologists.
