Similar literature
20 similar records found.
1.
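An expressed sequence tag (EST) is a fragment of an expressed gene sequence about 150-500 bp long, obtained by sequencing large numbers of randomly picked cDNA clones, and serves as a tag for the sequences expressed in a tissue or cell. Each EST represents one gene expressed in a particular tissue or cell type at a particular stage of an organism's life. This paper reviews the principles and methods of EST technology, the theoretical basis of research on early mammalian embryos, and the application of EST technology to early-embryo research, and discusses trends in EST-based research and analysis.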

2.
Expressed sequence tag (EST) data provide a powerful tool for identifying transcribed DNA sequences. However, because ESTs are relatively short, many exons are poorly covered by ESTs, which reduces the utility of EST data. Recently, signature sequence tag (SST) fingerprints were proposed as an alternative to EST fingerprints. Given a fingerprint set of probes, the SST of a clone is the subset of probes from the fingerprint set that hybridize with the clone. We demonstrate that, besides being a powerful technique for screening cDNA libraries, SST technology provides very accurate gene predictions. Even with a small fingerprint set (600-800 probes), SST-based gene recognition outperforms many conventional and EST-based methods. Increasing the size of the fingerprint set to 1500 probes provides almost perfect gene recognition. Even more importantly, SST-based gene predictions miss very few exons and therefore provide an opportunity to bypass the cDNA sequencing step on the way from finished genomic sequence to mutation detection in gene-hunting projects. Because SST data can be obtained in a highly parallel and inexpensive way, SST technology has the potential to complement EST technology for gene hunting.
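The SST definition in this abstract (the subset of fingerprint probes that hybridize with a clone) can be illustrated with a minimal sketch. It assumes that hybridization is modelled as an exact substring match, which is a simplification not taken from the paper; the function names `signature_sequence_tag` and `shared_probe_fraction` are hypothetical.

```python
def signature_sequence_tag(clone: str, probes: list[str]) -> frozenset[str]:
    """Return the subset of probes that 'hybridize' with the clone.

    Assumption: hybridization is modelled as an exact substring match.
    Real hybridization tolerates mismatches and depends on strand and
    experimental conditions, none of which is modelled here.
    """
    clone = clone.upper()
    return frozenset(p for p in probes if p.upper() in clone)


def shared_probe_fraction(sst_a: frozenset[str], sst_b: frozenset[str]) -> float:
    """Jaccard similarity between two SSTs (0.0 when both are empty)."""
    union = sst_a | sst_b
    return len(sst_a & sst_b) / len(union) if union else 0.0
```

Two clones can then be compared by the fraction of probes their SSTs share, which is one simple way to use such fingerprints for screening.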

3.
4.
An optimized protocol for analysis of EST sequences   Cited by: 16 (self-citations: 1, citations by others: 16)

5.
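ESTs and their applications   Cited by: 11 (self-citations: 0, citations by others: 11)
Lu Jiayun (陆佳韵), Wang Xiuqin (王秀琴). 《生命科学》, 1999, 11(4): 186-188
With the implementation of the Human Genome Project (HGP), sequencing of the human genome has progressed smoothly and is expected to be completed ahead of schedule in 2003. One focus of post-genome research is the study of genome-wide expression profiles and gene function. Expressed sequence tags (ESTs) are short partial cDNA sequences obtained by single-pass sequencing of large numbers of randomly picked cDNA clones, and they serve as tags for the sequences expressed in a tissue or cell. ESTs are already widely used in genome research and show good prospects. This article reviews how ESTs are generated, the related databases, and their applications.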

6.
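Research progress in plant genome expressed sequence tag (EST) projects   Cited by: 62 (self-citations: 0, citations by others: 62)
A plant expressed sequence tag (EST) project is a genome research strategy in which cDNA clones are picked at random and their 3′ or 5′ ends are sequenced in a large-scale single pass; the resulting DNA fragments of 150-500 bp are compared with sequences in databases to gain insight into genome structure, organization, and expression. This article reviews recent international plant EST projects, the scope of plant EST research, the application of bioinformatics to EST research, EST databases and how to query them, and the problems encountered in plant EST research.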

7.
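EST-based strategies for cloning novel genes   Cited by: 1 (self-citations: 0, citations by others: 1)
Liu Yuan (刘媛), Cai Jiabin (蔡嘉斌), Jiang Guosong (蒋国松), Tong Qiangsong (童强松). 《遗传》, 2008, 30(3): 257-262
Expressed sequence tags (ESTs) are short cDNA sequences obtained by single-pass, one-directional sequencing of randomly selected cDNA clones; each represents part of a complete gene. With the rapid development of bioinformatics and gene mapping, ESTs have become a powerful tool for gene mapping, gene cloning, and gene expression analysis. In recent years, the rapid expansion of EST databases and the use of ESTs to clone and map genes have revolutionized strategies for cloning novel genes. Despite some shortcomings, practice has shown that ESTs can greatly accelerate the discovery and study of novel genes. This article describes EST technology in detail, with particular attention to strategies for applying it to the cloning of novel genes.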

8.
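A new EST clustering method   Cited by: 11 (self-citations: 0, citations by others: 11)
This study developed an expressed sequence tag (EST) clustering method, ESTClustering, for analyzing the large volumes of data produced by large-scale EST sequencing in order to obtain high-quality, non-redundant expressed sequences. During clustering, the method uses the MEGABLAST tool to compare consensus sequences for sequence homology and uses the phrap program to check the assembly of each EST cluster. This clustering strategy reduces the impact of sequencing errors, identifies members of gene families effectively, and avoids interference from alternative splicing. Compared with the UniGene clustering method of NCBI (National Center for Biotechnology Information), the clusters produced by ESTClustering better reflect the diversity of expressed sequences. In a test on 112,256 Arabidopsis ESTs, ESTClustering produced 23,581 EST clusters, of which 13,597 corresponded to coding sequences in the Arabidopsis genome, close to the number of predicted genes in that genome that are supported by ESTs. Applied to a collection of 147,191 rice ESTs, the method formed 33,896 EST clusters.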
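The general idea of grouping ESTs into non-redundant clusters can be sketched as follows. This is not the ESTClustering method: instead of MEGABLAST comparison and phrap assembly checks, the sketch swaps in a toy similarity test (a minimum number of shared k-mers) and merges similar ESTs with a union-find structure. The function names and the parameters `k` and `min_shared` are assumptions chosen for illustration.

```python
from itertools import combinations


def kmers(seq: str, k: int) -> set[str]:
    """All overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_ests(ests: dict[str, str], k: int = 8, min_shared: int = 3) -> list[set[str]]:
    """Single-linkage clustering: link two ESTs if they share >= min_shared k-mers.

    Toy stand-in for alignment-based comparison; it ignores reverse
    complements, sequencing errors, and alternative splicing.
    """
    parent = {name: name for name in ests}

    def find(x: str) -> str:
        # Follow parent links to the cluster representative, halving paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    profiles = {name: kmers(seq.upper(), k) for name, seq in ests.items()}
    for a, b in combinations(ests, 2):
        if len(profiles[a] & profiles[b]) >= min_shared:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters: dict[str, set[str]] = {}
    for name in ests:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())
```

The all-pairs comparison is quadratic in the number of ESTs, so a sketch like this is only suitable for small examples, not for data sets of the size reported in the abstract.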

9.
ESTs or ‘expressed sequence tags’ are DNA sequences read from both ends of expressed gene fragments. The Merck-WashU EST Project and several other public EST projects are being performed to rapidly discover the complement of human genes, and make them easily accessible. These ESTs are widely used to discover novel members of gene families, to map genes to chromosomes as ‘sequence-tagged sites’ (STSs), and to identify mutations leading to heritable diseases. Informatic strategies for querying the EST databases are discussed, as well as the strengths and weaknesses of the EST data. There is a compelling need to build on the informatic synthesis of human gene data, and to devise facile methods for determining gene functions.

10.
EST-PAGE--managing and analyzing EST data   Cited by: 2 (self-citations: 0, citations by others: 2)
EST-PAGE provides a bioinformatics solution for expressed sequence tags (EST) data entry, database management, GenBank submission, process control and data retrieval from a unified web interface that can be easily customized and adapted by groups working on diverse EST sequencing projects. AVAILABILITY: The system and source code are available upon request from the authors. Supplementary information: http://EST-PAGE.binf.gmu.edu

11.
EST clustering error evaluation and correction   Cited by: 4 (self-citations: 0, citations by others: 4)
MOTIVATION: The gene expression intensity information conveyed by expressed sequence tag (EST) data can be used to infer important cDNA library properties, such as gene number and expression patterns. However, EST clustering errors, which often lead to greatly inflated estimates of the number of unique genes obtained, have become a major obstacle in these analyses. The EST clustering error structure, the relationship between clustering error and clustering criteria, and possible error correction methods need to be systematically investigated. RESULTS: We identify and quantify two types of EST clustering error, Type I and Type II, in EST clustering using the CAP3 assembly program. A Type I error occurs when ESTs from the same gene do not form a cluster, whereas a Type II error occurs when ESTs from distinct genes are falsely clustered together. While the Type II error rate is <1.5% for both 5' and 3' EST clustering, the Type I error in the 5' EST case is approximately 10 times higher than in the 3' EST case (30% versus 3%). An over-stringent identity rule, e.g., P ≥ 95%, may even inflate the Type I error in both cases. We demonstrate that approximately 80% of the Type I error is due to insufficient overlap among sibling ESTs (ISO error) in 5' EST clustering. A novel statistical approach is proposed to correct ISO error to provide more accurate estimates of the true gene cluster profile.
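The two error types defined above can be made concrete with a small sketch. The abstract does not say how the rates are computed, so the pairwise definition used here is an assumption: the Type I rate is the fraction of same-gene EST pairs placed in different clusters, and the Type II rate is the fraction of different-gene pairs placed in the same cluster. The function name `clustering_error_rates` is hypothetical.

```python
from itertools import combinations


def clustering_error_rates(true_gene: dict[str, str],
                           cluster: dict[str, str]) -> tuple[float, float]:
    """Return (type_I_rate, type_II_rate) over all EST pairs.

    true_gene maps EST id -> gene of origin; cluster maps EST id -> cluster id.
    Assumed pairwise definition, not necessarily the one used in the paper.
    """
    same_gene = split = diff_gene = merged = 0
    for a, b in combinations(sorted(true_gene), 2):
        together = cluster[a] == cluster[b]
        if true_gene[a] == true_gene[b]:
            same_gene += 1
            split += not together      # same gene, different clusters: Type I
        else:
            diff_gene += 1
            merged += together         # different genes, same cluster: Type II
    type_1 = split / same_gene if same_gene else 0.0
    type_2 = merged / diff_gene if diff_gene else 0.0
    return type_1, type_2
```

A measure of this kind requires a benchmark in which the gene of origin of each EST is known.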

12.
13.
14.
Clustering expressed sequence tags (ESTs) is a powerful strategy for gene identification, gene expression studies and identifying important genetic variations such as single nucleotide polymorphisms. To enable fast clustering of large-scale EST data, we developed PaCE (for Parallel Clustering of ESTs), a software program for EST clustering on parallel computers. In this paper, we report on the design and development of PaCE and its evaluation using Arabidopsis ESTs. The novel features of our approach include: (i) design of memory-efficient algorithms to reduce the memory required to linear in the size of the input, (ii) a combination of algorithmic techniques to reduce the computational work without sacrificing the quality of clustering, and (iii) use of parallel processing to reduce run-time and facilitate clustering of larger data sets. Using a combination of these techniques, we report the clustering of 168 200 Arabidopsis ESTs in 15 min on an IBM xSeries cluster with 30 dual-processor nodes. We also clustered 327 632 rat ESTs in 47 min and 420 694 Triticum aestivum ESTs in 3 h and 15 min. We demonstrate the quality of our software using benchmark Arabidopsis EST data, and by comparing it with CAP3, a software package widely used for EST assembly. Our software allows clustering of much larger EST data sets than is possible with current software. Because of its speed, it also facilitates multiple runs with different parameters, giving biologists a tool to better analyze EST sequence data. Using PaCE, we clustered EST data from 23 plant species and the results are available at the PlantGDB website.

15.
16.
The genus Acanthamoeba can cause severe infections such as granulomatous amebic encephalitis and amebic keratitis in humans. However, little genomic information on Acanthamoeba has been reported. Here, we constructed an Acanthamoeba expressed sequence tag (EST) database (Acanthamoeba EST DB) derived from our four Acanthamoeba cDNA libraries. The Acanthamoeba EST DB contains 3,897 ESTs generated from amebae under various conditions of long-term in vitro culture, mouse brain passage, or encystation, together with Acanthamoeba data downloaded from the National Center for Biotechnology Information (NCBI) and the Taxonomically Broad EST Database (TBestDB). Almost all reported cDNA/genomic sequences of Acanthamoeba are available through a stand-alone BLAST system with nucleotide (BLAST NT) and amino acid (BLAST AA) sequence databases. In the BLAST results, each gene links to relevant information including sequence data, gene orthology annotations, relevant references, and a BlastX result. This is the first attempt to construct an Acanthamoeba database with genes expressed under diverse conditions. These data were integrated into a database (http://www.amoeba.or.kr).

17.
Analysis of the human expressed sequence tag (EST) database identified four clones that contain sequences of previously uncharacterized genes, members of the ATP-binding cassette (ABC) superfamily. Two new ABC genes (EST20237, 31252) are located at Chromosome (Chr) 1q42 and 1q25 respectively in humans, as determined by FISH; at locations distinct from previously mapped genes of this superfamily. Two additional clones, EST 600 and EST 1596, were found to represent different ATP-binding domains of the same gene, ABC2. This gene was localized to 9q34 in humans by FISH and to the proximal region of Chr 2 in mice by linkage analysis. All genes display extensive diversity in sequence and expression pattern. We present several approaches to characterizing EST clones and demonstrate that the analysis of EST clones from different tissues is a powerful approach to identify new members of important gene families. Some drawbacks of using EST databases, including chimerism of cDNA clones, are discussed.

18.
19.

Background

The continuous flow of EST data remains one of the richest sources for discoveries in modern biology. The first step in EST data mining is usually EST clustering, the process of grouping the original fragments according to their annotation or their similarity to known genomic DNA or to each other. Clustered EST data, accumulated in databases such as UniGene, STACK and TIGR Gene Indices, have proven to be crucial in research areas from gene discovery to regulation of gene expression.

Results

We have developed a new nucleotide sequence matching algorithm and its implementation for clustering EST sequences. The program is based on the original CLU match detection algorithm, which has improved performance over the widely used d2_cluster. The CLU algorithm automatically ignores low-complexity regions like poly-tracts and short tandem repeats.
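To illustrate what ignoring low-complexity regions can mean in practice, the sketch below masks homopolymer runs and short tandem repeats with `N` before any matching is done. It is not the CLU algorithm, whose internals are not described in this abstract; the regular expressions, the thresholds (runs of 8 or more identical bases, units of 2-3 bases repeated 4 or more times), and the function name `mask_low_complexity` are all assumptions.

```python
import re

# Assumed thresholds, chosen only for illustration.
POLY_TRACT = re.compile(r"([ACGT])\1{7,}")            # 8+ identical bases
TANDEM_REPEAT = re.compile(r"([ACGT]{2,3})\1{3,}")    # 2-3 base unit, 4+ copies


def mask_low_complexity(seq: str) -> str:
    """Replace poly-tracts and short tandem repeats with runs of 'N'."""
    seq = seq.upper()
    for pattern in (POLY_TRACT, TANDEM_REPEAT):
        # Keep sequence length unchanged so coordinates stay valid.
        seq = pattern.sub(lambda m: "N" * len(m.group(0)), seq)
    return seq
```

Masked positions can then be skipped when looking for matches between ESTs, so that a shared poly(A) tail alone does not pull unrelated sequences into one cluster.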

Conclusion

CLU represents a new generation of EST clustering algorithm with improved performance over current approaches. An early implementation can be applied in small and medium-size projects. The CLU program is available on an open source basis free of charge. It can be downloaded from http://compbio.pbrc.edu/pti

20.