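Theoretical Prediction of Human Pol Ⅱ Promoters Based on Sequence Features
Cite this article: YANG Ke-li, XU Qiang. Theoretical prediction of human Pol Ⅱ promoters based on sequence features [J]. Life Science Research, 2009, 13(5): 403-407.
Authors: YANG Ke-li, XU Qiang
Affiliation: Department of Physics, Baoji University of Arts and Sciences, Baoji 721007, Shaanxi, China
Funding: Master's start-up project of Baoji University of Arts and Sciences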
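Abstract: Using known human Pol Ⅱ promoter sequence data, both sequence-content and sequence-signal features of promoters are selected to build a support vector machine (SVM) classifier for promoters. The 6-mer frequencies of the promoter sequences serve as the diversity-source parameters from which the sequence-content features are built. In parallel, the 3-mer frequencies at 24 positions are selected as sequence-signal features and used to build position weight matrices (PWM). The two resulting sets of parameters are input into the SVM to predict human promoters. Predictive performance is assessed by 10-fold cross-validation and on an independent data set, and the correlation coefficient exceeds 95%. The results show that the increment of diversity algorithm combined with an SVM effectively improves the prediction success rate and is an effective method for eukaryotic promoter prediction.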

Keywords: promoter; increment of diversity; position weight matrix (PWM); support vector machine (SVM)
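
The position weight matrix built from 3-mer frequencies can be sketched as follows. This is a minimal illustration, not the authors' construction: it assumes aligned, equal-length training windows, a uniform background of 1/64 per 3-mer, and a pseudocount of 1. The names `build_kmer_pwm` and `pwm_score` are hypothetical, and the 24 positions used in the paper are not specified here, so the code simply covers every position of whatever window it is given.

```python
import math
from collections import Counter


def build_kmer_pwm(aligned_seqs, k=3, pseudocount=1.0):
    """Build per-position log-odds weights for k-mers from aligned sequences.

    Returns a list with one (weights, default) pair per position, where
    `default` is the weight of any k-mer never seen at that position.
    Assumes a uniform background and pseudocount > 0.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all sequences must have the same length")
    n_kmers = 4 ** k
    background = 1.0 / n_kmers
    total = len(aligned_seqs) + pseudocount * n_kmers
    default = math.log((pseudocount / total) / background)
    pwm = []
    for pos in range(length - k + 1):
        counts = Counter(s[pos:pos + k].upper() for s in aligned_seqs)
        weights = {
            kmer: math.log(((c + pseudocount) / total) / background)
            for kmer, c in counts.items()
        }
        pwm.append((weights, default))
    return pwm


def pwm_score(seq, pwm, k=3):
    """Sum the position-specific weights of the k-mers in `seq`."""
    if len(seq) < len(pwm) + k - 1:
        raise ValueError("sequence is shorter than the PWM window")
    seq = seq.upper()
    return sum(
        weights.get(seq[pos:pos + k], default)
        for pos, (weights, default) in enumerate(pwm)
    )
```

A window that matches the training sequences scores above zero, while a window made of unseen 3-mers scores below zero. The score can then serve as one signal feature alongside the content features.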

Predicting Human Pol Ⅱ Promoter Based on Sequence Features
YANG Ke-li, XU Qiang. Predicting Human Pol Ⅱ Promoter Based on Sequence Features[J]. Life Science Research, 2009, 13(5): 403-407.
Authors: YANG Ke-li, XU Qiang
Abstract: Content vectors and signal vectors were extracted from promoter sequences using six least increment of diversity values, three kinds of position weight matrix (PWM), and the GC percentage of the sequences. The resulting vectors were input into a support vector machine (SVM) to build a promoter classification model, which was then used to predict human Pol Ⅱ promoter sequences. The model was validated by 10-fold cross-validation and on independent test data. The overall prediction accuracy (sensitivity) and the specificity both exceeded 95%. These results indicate that combining the increment of diversity with an SVM is an effective method for predicting eukaryotic promoter sequences.
Keywords: promoter sequences; increment of diversity; position weight matrix (PWM); support vector machines (SVM)
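
The two abstracts on this page report sensitivity, specificity, and a correlation coefficient, but give no formulas. The sketch below uses the standard confusion-matrix definitions as an assumption: sensitivity TP/(TP+FN), specificity TN/(TN+FP), and the Matthews correlation coefficient. Some promoter-prediction papers instead define specificity as TP/(TP+FP), and the page does not say which form the authors used.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, 1 = promoter, 0 = non-promoter."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Fraction of true promoters that were predicted as promoters."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-promoters predicted as non-promoters (assumed form)."""
    return tn / (tn + fp)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 if any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

In a 10-fold cross-validation these counts would be accumulated over the ten held-out folds before the three measures are computed.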
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).