Similar literature
19 similar articles found.
1.
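Starting from the view that the subcellular location of an apoptosis protein is determined mainly by its amino acid sequence, this study predicts the subcellular location of six classes of apoptosis proteins with an algorithm that combines the increment of diversity with a support vector machine (ID_SVM). The features are the n-peptide composition of local amino acid sequences and the hydropathy distribution along the sequence. Under the re-substitution and jackknife tests, the overall prediction success rates of ID_SVM reached 94.6% and 84.2%, respectively; under 5-fold and 10-fold cross-validation, the overall success rates also exceeded 83%. A comparison of ID and ID_SVM shows that combining the increment of diversity with a support vector machine improves the success rate, indicating that ID_SVM is an effective method for predicting the subcellular location of apoptosis proteins.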

2.
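Predicting protein subcellular localization with an ensemble of improved KNN classifiers   Cited by: 1 (self-citations: 0, citations by others: 1)
Several similarity-alignment K-nearest neighbor (KNN) classifiers are combined with the AdaBoost algorithm to predict protein subcellular localization. The similarity-alignment KNN algorithm uses amino acid composition, dipeptide composition and pseudo amino acid composition as separate sequence features, and uses BLAST alignment in the KNN decision stage to assign the subcellular localization. Under the jackknife test, the AdaBoost ensemble classifier drew on the three sequence features and reached highest prediction success rates of 92.4% and 93.1% on the CH317 and Gram1253 datasets, respectively. The results indicate that the AdaBoost ensemble of improved KNN classifiers is an effective method for predicting protein subcellular localization.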

3.
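Bioinformatics methods were used to predict and analyze the physicochemical properties and homology of the STC2 protein across species, and, for human STC2, its hydrophilicity, nuclear localization sequences, transmembrane regions, signal peptide, subcellular localization, secondary structure, tertiary structure, interacting proteins and GO annotation. The STC2 protein consists of 302 amino acids, has a theoretical isoelectric point of 6.93, is strongly hydrophilic and is relatively conserved in mammals. It has no nuclear localization sequence or transmembrane structure, contains a signal peptide, and is located mainly in the endoplasmic reticulum or secreted outside the cell. The predicted secondary structure has 11 α-helix regions and 1 β-sheet region, and the Ramachandran plot indicates that the predicted tertiary structure is reliable. Interacting proteins and GO annotation suggest that STC2 may take part in a variety of cellular biological processes. This predictive analysis of the structure and function of human STC2 provides a theoretical basis for further research on the protein and new ideas for the diagnosis and treatment of STC2-related diseases.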

4.
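Predicting protein subcellular compartments by similarity alignment   Cited by: 1 (self-citations: 0, citations by others: 1)
Wang Xiongfei, Zhang Liang, Xue Wei, Zhao Nan, Xu Huanliang. 《微生物学通报》, 2016, 43(10): 2298-2305
Objective: To predict the subcellular compartment to which a protein belongs, providing a basis for further study of its biological function. Methods: Amino acid composition, dipeptide composition and pseudo amino acid composition of the protein sequence were used as sequence features, and a K-nearest neighbor (KNN) classifier improved with BLAST alignment was used to predict the subcellular compartment of each sequence. Results: Under the jackknife test, the success rates for the three features were 91.5%, 91.5% and 89.3% on the CH317 dataset, and 93.9%, 92.9% and 89.8% on the ZD98 dataset. Conclusion: The KNN algorithm improved with BLAST alignment is an effective method for predicting protein subcellular compartments.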
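The jackknife test reported in several of these abstracts is leave-one-out validation: each protein is held out in turn and predicted from all the others. The sketch below illustrates that procedure with a plain KNN on amino acid composition. It is a simplified stand-in, not the authors' method: the BLAST-based decision step is not reproduced, and the function names and the Euclidean distance are assumptions made for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Return the 20-dimensional amino acid frequency vector (seq must be non-empty)."""
    counts = Counter(seq)
    return [counts.get(a, 0) / len(seq) for a in AMINO_ACIDS]


def knn_predict(query, train, k=1):
    """Majority vote among the k training vectors closest to query.

    train is a list of (vector, label) pairs; squared Euclidean distance
    is an assumption, not something the abstracts specify.
    """
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(query, item[0])),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def jackknife_accuracy(dataset, k=1):
    """Leave-one-out accuracy over a list of (sequence, label) pairs."""
    samples = [(aa_composition(seq), label) for seq, label in dataset]
    correct = 0
    for i, (vec, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # hold out sample i
        if knn_predict(vec, rest, k) == label:
            correct += 1
    return correct / len(samples)
```

Because every protein is used as a test case exactly once, the jackknife result does not depend on how the data are split, which is one reason it is reported alongside k-fold results.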

5.
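Recognition of protein subcellular localization   Cited by: 5 (self-citations: 2, citations by others: 3)
Proteins were divided into 12 classes according to their subcellular localization. Using the mathematical theory of measures of diversity, the counts of the 400 amino acid dipeptides in a protein were taken as the diversity source, and subcellular localization was predicted by computing the increment of diversity. Both the self-consistency and jackknife tests gave high prediction success rates: 84.5% for self-consistency and 81.1% for jackknife.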
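A minimal sketch of the increment-of-diversity idea, built on the 400 dipeptide counts mentioned in this abstract. It assumes the commonly used definitions D(X) = N ln N − Σ nᵢ ln nᵢ and ID(X, Y) = D(X + Y) − D(X) − D(Y), and it assumes the query is assigned to the class with the smallest ID; the abstract states neither formula nor decision rule, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 entries


def dipeptide_counts(seq):
    """Count overlapping dipeptides, returned in the fixed DIPEPTIDES order."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    return [counts.get(dp, 0) for dp in DIPEPTIDES]


def diversity(counts):
    """Measure of diversity D(X) = N ln N - sum(n_i ln n_i) (assumed definition)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); it is 0 when x and y are proportional."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def predict(query_seq, class_sources):
    """Pick the class whose pooled dipeptide counts give the smallest ID.

    class_sources maps a class label to a 400-element count vector pooled
    over that class's training proteins (an assumed data layout).
    """
    q = dipeptide_counts(query_seq)
    return min(class_sources, key=lambda c: increment_of_diversity(q, class_sources[c]))
```

A small ID means the query's dipeptide usage resembles the pooled usage of that class, so the class with the minimum ID is taken as the prediction.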

6.
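Lin Hao. 《生物信息学》, 2009, 7(4): 252-254
Because the subcellular location of a protein correlates strongly with its primary sequence, the increment of diversity was used to describe the similarity between proteins in amino acid composition and dipeptide composition, and a modified Mahalanobis discriminant (called the IDQD method here) was used to predict the subcellular location of mycobacterial proteins. Protein datasets with different levels of sequence similarity were evaluated with the jackknife test. When the sequence similarity of the dataset was 70% or lower, the prediction accuracy stayed at about 75%. On the full set of 852 proteins the prediction success rate reached 87.7%, which is better than that of existing algorithms, indicating that IDQD is an effective method for predicting the subcellular location of mycobacterial proteins.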

7.
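To identify differences in the expression of resistance-related proteins in apple (Malus domestica) cultivars with different resistance levels in response to ring rot pathogen stress, the resistant cultivar Huayue and the susceptible cultivar Golden Delicious were used as materials. High-throughput isotope-labeling quantification (IBT) combined with liquid chromatography-tandem mass spectrometry (LC-MS) was used to analyze differential proteome expression in the leaves of the resistant and susceptible cultivars before and after pathogen treatment, and 171 differentially expressed proteins (DEPs) were identified. GO enrichment and KEGG pathway analysis showed that 686 GO terms were annotated across the three categories of cellular component, molecular function and biological process, and that 52 DEPs were annotated to 18 significantly different KEGG pathways (P<0.05). Subcellular localization prediction showed that 170 of the 171 DEPs were located in 8 types of organelle. Functional annotation showed that 46 DEPs belonged to 7 classes of resistance-related proteins: thaumatin-like proteins, peroxidases, polyphenol oxidases, allergen proteins, chitinases, endoglucanases and major latex proteins. The expression characteristics of the resistance-related proteins and the gene quantification results were also analyzed. These results provide a reference for further study of the resistance mechanisms of resistant and susceptible apple cultivars in response to ring rot pathogen stress.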

8.
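Studies have shown that many neurodegenerative diseases are related to the localization of proteins in the Golgi apparatus, so correctly identifying sub-Golgi proteins can help in developing drugs for these diseases. This paper builds a two-class sub-Golgi protein dataset, extracts features including amino acid composition, conjoint triad information, averaged chemical shift and Gene Ontology annotation, and makes predictions with a support vector machine. Under 5-fold cross-validation, the overall prediction success rate was 87.43%.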

10.
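Subcellular localization of the Dnd1 protein and its inhibitory effect on HeLa cell proliferation   Cited by: 1 (self-citations: 0, citations by others: 1)
The protein encoded by Dnd1, a susceptibility gene for mouse testicular germ cell tumors, is an evolutionarily conserved RNA-binding protein. To investigate the subcellular localization of mouse Dnd1, its effect on cell proliferation and the underlying mechanism, a combination of subcellular localization analysis programs was used to predict the eukaryotic subcellular localization of Dnd1. For localization by green fluorescent protein (GFP) fusion, the recombinant plasmid pEGFP-Dnd1 was constructed and transfected into HeLa and GC-1 cells, and the localization of the Dnd1 protein was observed under a fluorescence microscope. The MTT assay and flow cytometry were used to measure the effect of Dnd1 overexpression on HeLa cell proliferation and on the cell cycle, and the effect of Dnd1 on AP-1 transcriptional activity was examined in the HeLa cell line. The results showed that: (1) bioinformatics predicted that Dnd1 is expressed mainly in the nucleus, with a small amount in the cytoplasm, and fluorescence microscopy confirmed that the Dnd1 protein is located mainly in the nucleus, with a small amount distributed in the cytoplasm; (2) overexpression of Dnd1 in the HeLa cell line inhibited cell proliferation and induced G1 cell cycle arrest; (3) Dnd1 inhibited the transcriptional activity of AP-1, so inhibition of AP-1-mediated transcription is a possible mechanism by which Dnd1 suppresses cell proliferation. This study gives a preliminary account of the subcellular localization of the Dnd1 protein and its growth-inhibitory effect on HeLa cells, laying a foundation for further study of Dnd1 gene function.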

11.

Background  

Gene Ontology (GO) annotation, which describes the function of genes and gene products across species, has recently been used to predict protein subcellular and subnuclear localization. Existing GO-based prediction methods for protein subcellular localization use the known accession numbers of query proteins to obtain their annotated GO terms. An accurate prediction method for predicting subcellular localization of novel proteins without known accession numbers, using only the input sequence, is worth developing.

12.
Predicting subcellular localization with AdaBoost Learner   Cited by: 1 (self-citations: 0, citations by others: 1)
Protein subcellular localization, which tells where a protein resides in a cell, is an important characteristic of a protein and relates closely to its function. The prediction of subcellular localization plays an important role in the prediction of protein function, genome annotation and drug design. Predicting subcellular localization with a bioinformatics approach is therefore an important and challenging task. In this paper, a robust predictor, AdaBoost Learner, is introduced to predict protein subcellular localization based on amino acid composition. Jackknife cross-validation and an independent dataset test were used to demonstrate that AdaBoost is a robust and efficient model for predicting protein subcellular localization. The correct prediction rates were 74.98% and 80.12% for the jackknife test and the independent dataset test, respectively, which are higher than those of other existing predictors. An online server for predicting subcellular localization of proteins based on the AdaBoost classifier was made available at http://chemdata.shu.edu.cn/sl12.

13.
Methods for predicting bacterial protein subcellular localization   Cited by: 1 (self-citations: 0, citations by others: 1)
The computational prediction of the subcellular localization of bacterial proteins is an important step in genome annotation and in the search for novel vaccine or drug targets. Since the 1991 release of PSORT I--the first comprehensive algorithm to predict bacterial protein localization--many other localization prediction tools have been developed. These methods offer significant improvements in predictive performance over PSORT I and the accuracy of some methods now rivals that of certain high-throughput laboratory methods for protein localization identification.

14.
The subcellular localization of a protein can provide important information about its function within the cell. As eukaryotic cells and particularly mammalian cells are characterized by a high degree of compartmentalization, most protein activities can be assigned to particular cellular compartments. The categorization of proteins by their subcellular localization is therefore one of the essential goals of the functional annotation of the human genome. We previously performed a subcellular localization screen of 52 proteins encoded on human chromosome 21. In the current study, we compared the experimental localization data to the in silico results generated by nine leading software packages with different prediction resolutions. The comparison revealed striking differences between the programs in the accuracy of their subcellular protein localization predictions. Our results strongly suggest that the recently developed predictors utilizing multiple prediction methods tend to provide significantly better performance over purely sequence-based or homology-based predictions.

15.

Background  

The computational prediction of mycobacterial proteins' subcellular localization is of key importance for proteome annotation and for the identification of new drug targets and vaccine candidates. Several subcellular localization classifiers have been developed over the past few years, which have comprised both general localization and feature-based classifiers. Here, we have validated the ability of different bioinformatics approaches, through the use of SignalP 2.0, TatP 1.0, LipoP 1.0, Phobius, PA-SUB 2.5, PSORTb v.2.0.4 and Gpos-PLoc, to predict secreted bacterial proteins. These computational tools were compared in terms of sensitivity, specificity and Matthews correlation coefficient (MCC) using a set of mycobacterial proteins having less than 40% identity, none of which are included in the training data sets of the validated tools and whose subcellular localization has been experimentally confirmed. These proteins belong to the TBpred training data set, a computational tool specifically designed to predict mycobacterial proteins.
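The three comparison metrics named in this abstract can be computed from a binary confusion matrix (here, secreted versus non-secreted). The sketch below uses the standard definitions; the function name and the convention of returning 0.0 when a denominator is zero are assumptions for illustration, not details from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts.

    tp/fn count true positives that were predicted correctly/incorrectly,
    tn/fp do the same for true negatives. Undefined ratios are returned
    as 0.0 (an assumed convention).
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc
```

Unlike sensitivity or specificity alone, MCC uses all four cells of the confusion matrix, so it stays informative when secreted proteins are a small minority of the test set.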

16.
Many methods have been described to predict the subcellular location of proteins from sequence information. However, most of these methods either rely on global sequence properties or use a set of known protein targeting motifs to predict protein localization. Here, we develop and test a novel method that identifies potential targeting motifs using a discriminative approach based on hidden Markov models (discriminative HMMs). These models search for motifs that are present in a compartment but absent in other, nearby, compartments by utilizing a hierarchical structure that mimics the protein sorting mechanism. We show that both discriminative motif finding and the hierarchical structure improve localization prediction on a benchmark data set of yeast proteins. The motifs identified can be mapped to known targeting motifs and they are more conserved than the average protein sequence. Using our motif-based predictions, we can identify potential annotation errors in public databases for the location of some of the proteins. A software implementation and the data set described in this paper are available from http://murphylab.web.cmu.edu/software/2009_TCBB_motif/.

17.
The subcellular location of a protein is useful information for determining its function, screening for drug candidates, vaccine design, annotation of gene products and selecting relevant proteins for further studies. Computational prediction of subcellular localization deals with predicting the location of a protein from its amino acid sequence. For a computational localization prediction method to be more accurate, it should exploit all possible relevant biological features that contribute to the subcellular localization. In this work, we extracted the biological features from the full-length protein sequence to incorporate more biological information. A new biological feature, the distribution of atomic composition, is used together with multiple physicochemical properties, amino acid composition, three-part amino acid composition and sequence similarity for predicting the subcellular location of the protein. Support vector machines are designed for four modules and the prediction is made by a weighted voting system. Our system makes predictions with accuracies of 100%, 82.47% and 88.81% for the self-consistency test, jackknife test and independent data test, respectively. Our results provide evidence that prediction based on biological features derived from the full-length amino acid sequence gives better accuracy than prediction based on features derived from the N-terminus alone. Treating the features as a distribution over the entire sequence brings out the underlying property distribution in greater detail and enhances prediction accuracy.

18.
19.
MOTIVATION: Functional annotation of unknown proteins is a major goal in proteomics. A key annotation is the prediction of a protein's subcellular localization. Numerous prediction techniques have been developed, typically focusing on a single underlying biological aspect or predicting a subset of all possible localizations. An important step is taken towards emulating the protein sorting process by capturing and bringing together biologically relevant information, and addressing the clear need to improve prediction accuracy and localization coverage. RESULTS: Here we present a novel SVM-based approach for predicting subcellular localization, which integrates N-terminal targeting sequences, amino acid composition and protein sequence motifs. We show how this approach improves the prediction based on N-terminal targeting sequences, by comparing our method TargetLoc against existing methods. Furthermore, MultiLoc performs considerably better than comparable methods predicting all major eukaryotic subcellular localizations, and shows better or comparable results to methods that are specialized on fewer localizations or for one organism. AVAILABILITY: http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc/
