
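A New EST Clustering Method
Cite this article: Zhang Lida, Yuan Dejun, Zhang Jianwei, Wang Shiping, Zhang Qifa. A New Method for EST Clustering [J]. Acta Genetica Sinica, 2003, 30(2): 147-153.
Authors: Zhang Lida  Yuan Dejun  Zhang Jianwei  Wang Shiping  Zhang Qifa
Affiliation: National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070
Funding: Supported by the National Key Basic Research Development Program (973 Program)
Abstract: This study developed an EST (expressed sequence tag) clustering method, ESTClustering, for analyzing the large volume of data produced by large-scale EST sequencing in order to obtain high-quality, non-redundant expressed sequences. During clustering, the method uses the MEGABLAST tool to compare consensus sequences for sequence homology and uses the phrap program to check the assembly of each EST cluster. This clustering strategy reduces the impact of sequencing errors, effectively identifies gene family members, and avoids interference from alternative splicing. Compared with the UniGene clustering method of NCBI (National Center for Biotechnology Information), the clustering results of ESTClustering better reflect the diversity of expressed sequences. In a test clustering of 112 256 Arabidopsis ESTs, ESTClustering produced 23 581 EST clusters, of which 13 597 have corresponding Arabidopsis genome coding sequences, a number close to the count of predicted genes in that genome that are supported by EST evidence. Applying the method to a collected set of 147 191 rice EST sequences yielded 33 896 EST clusters.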

Keywords: EST clustering method  consensus sequence  non-redundant cDNA library  gene sequencing
Article ID: 0379-4172(2003)02-0147-07
Revised manuscript received: May 24, 2002

A New Method for EST Clustering
Abstract: We developed an EST (expressed sequence tag) clustering method, ESTClustering, to generate high-quality, unique expressed sequences from large-scale EST sequencing data. During clustering, the method compares consensus sequences with MEGABLAST and checks the assembly of each cluster with phrap. This clustering strategy can efficiently identify gene family members and alternative splicing forms of expressed sequences. It can also reduce the adverse effects caused by sequencing errors. Compared with the UniGene clustering method of the National Center for Biotechnology Information (NCBI), ESTClustering tends to retain more distinct forms of expressed genes. Analysis of 112 256 Arabidopsis ESTs with ESTClustering produced 23 581 EST clusters. Among these Arabidopsis EST clusters, 13 597 have corresponding genome coding sequences, a number close to the number of genes predicted with Arabidopsis ESTs. Using this clustering method, a total of 147 191 rice ESTs were clustered into 33 896 groups.
Keywords: EST clustering  consensus sequence  non-redundant cDNA library
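
The abstract describes grouping ESTs by sequence similarity. As a rough illustration of that general idea only, the Python sketch below groups sequence IDs by single-linkage clustering over precomputed pairwise hits. It is not the authors' ESTClustering procedure: it does not build or compare consensus sequences, does not call MEGABLAST or phrap, and the function name `cluster_ests`, the hit tuple layout, and both thresholds are assumptions made for this example.

```python
from collections import defaultdict


def cluster_ests(est_ids, hits, min_identity=95.0, min_overlap=100):
    """Group EST IDs by single-linkage clustering over pairwise hits.

    hits: iterable of (query_id, subject_id, percent_identity, overlap_length).
    The thresholds are illustrative assumptions, not values from the paper.
    """
    # Union-find structure: every EST starts as its own cluster.
    parent = {est: est for est in est_ids}

    def find(x):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity, overlap in hits:
        # Skip self-hits and hits that mention unknown IDs.
        if query == subject or query not in parent or subject not in parent:
            continue
        # Merge the two clusters only when the hit passes both thresholds.
        if identity >= min_identity and overlap >= min_overlap:
            root_q, root_s = find(query), find(subject)
            if root_q != root_s:
                parent[root_s] = root_q

    clusters = defaultdict(list)
    for est in est_ids:
        clusters[find(est)].append(est)
    # Return clusters in a stable order: members sorted, clusters by first member.
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])


# Hypothetical toy input: IDs and hit values are made up for illustration.
est_ids = ["est1", "est2", "est3", "est4", "est5"]
hits = [
    ("est1", "est2", 98.5, 250),  # passes both thresholds
    ("est2", "est3", 96.0, 180),  # passes both thresholds
    ("est4", "est5", 88.0, 300),  # identity too low
    ("est3", "est4", 99.0, 40),   # overlap too short
]
clusters = cluster_ests(est_ids, hits)
```

Single linkage merges two groups on any one qualifying hit, so a chimeric or error-containing read can chain unrelated clusters together. The abstract indicates that ESTClustering instead compares consensus sequences and checks each cluster by assembly, which this sketch does not attempt to reproduce.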
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.