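Predicting membrane protein types with the random forest method   Cited by: 2 (self-citations: 0; citations by others: 2)
The type of a membrane protein is closely related to its function, so predicting the type is an important means of studying function, and predicting it directly from the amino acid sequence is of considerable value. Starting from the amino acid sequence, this paper uses the combined increment of diversity together with pseudo-amino acid composition as prediction parameters and applies a random forest classifier to predict eight types of membrane proteins. The prediction accuracy is 86.3% under the Jackknife test and 93.8% under the independent test, better than previously reported results.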

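DNA methylation, an epigenetic modification that acts directly on the DNA sequence, can affect gene expression without changing the primary structure of the DNA molecule and plays an important role in life processes. In mammals, DNA methylation occurs mainly at the cytosine of CpG dinucleotides and is unevenly distributed across the genome. Accurate prediction of DNA methylation sites helps clarify how DNA methylation regulates gene expression and provides a new basis for the early diagnosis and treatment of tumors. This paper applies the increment of diversity combined with quadratic discriminant analysis to identify the methylation status of human CpG dinucleotides. The overall accuracy under 5-fold cross-validation exceeds 80%, and the area under the receiver operating characteristic curve reaches 0.86. Compared with existing methods, the prediction success rate is significantly higher. This indicates that the increment of diversity combined with quadratic discriminant analysis is suitable for predicting methylation sites, and that methylation sites in genomic sequences are sequence dependent.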

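Based on the view that the subcellular location of an apoptosis protein is determined mainly by its amino acid sequence, this work uses the n-peptide composition of local amino acid sequences and the hydropathy distribution of the sequence, together with an algorithm that combines the increment of diversity with a support vector machine (ID_SVM), to predict the subcellular location of six classes of apoptosis proteins. The overall prediction success rate of ID_SVM reaches 94.6% under the Re-substitution test and 84.2% under the Jackknife test, and exceeds 83% under both the 5-fold and 10-fold tests. A comparison of ID and ID_SVM shows that combining the increment of diversity with a support vector machine improves the prediction success rate, indicating that ID_SVM is an effective method for predicting the subcellular location of apoptosis proteins.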
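
Several abstracts in this list rely on the increment of diversity (ID) without stating its formula. The sketch below assumes the commonly used definition, D(X) = N log N − Σ n_i log n_i for a count vector X with total N, and ID(X, Y) = D(X + Y) − D(X) − D(Y); the abstracts themselves do not confirm that exact form. The function names and the toy count vectors are hypothetical, and the classifier shown is a plain minimum-ID rule, not the ID_SVM pipeline.

```python
import math


def diversity(counts):
    """Measure of diversity D(X) = N log N - sum(n_i log n_i); zero counts contribute 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); small values mean similar compositions."""
    mixed = [a + b for a, b in zip(x, y)]
    return diversity(mixed) - diversity(x) - diversity(y)


def classify_min_id(query, class_profiles):
    """Assign the query to the class whose reference counts give the smallest ID.

    class_profiles is a hypothetical dict: label -> count vector of the same length.
    """
    return min(class_profiles,
               key=lambda label: increment_of_diversity(query, class_profiles[label]))


if __name__ == "__main__":
    profiles = {"a": [30, 10], "b": [10, 30]}  # toy reference counts, not real data
    print(classify_min_id([3, 1], profiles))
```

In an ID_SVM-style setup the ID values against each class would presumably serve as input features for the support vector machine instead of being compared directly; that step is not sketched here.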

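Predicting protein subcellular localization with the increment of diversity combined with a support vector machine   Cited by: 3 (self-citations: 0; citations by others: 3)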
赵禹  赵巨东  姚龙 《生物信息学》2010,8(3):237-239,244
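Functional annotation of unknown proteins is a main goal of proteomics, and a key annotation is the prediction of protein subcellular localization. This paper applies the increment of diversity combined with a support vector machine (ID_SVM) to predict five classes of subcellular localization for Gram-positive bacterial proteins. Under the independent test, the overall prediction success rate is 89.66%. The results show that the ID_SVM algorithm greatly improves the prediction success rate.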

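Prediction of protein super-secondary structure is a very important intermediate step in tertiary structure prediction. Starting from the primary sequence, this paper predicts four classes of simple super-secondary structures in 5793 proteins. With site-specific amino acids as parameters and three ways of extracting sequence segments, prediction with the increment of diversity algorithm alone gave unsatisfactory results. Feeding the combined increment of diversity values into a support vector machine as feature parameters gave better results: the mean overall accuracy under 5-fold cross-validation reaches 83.0%, and the Matthews correlation coefficient is above 0.71.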
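
For reference, the Matthews correlation coefficient quoted above can be computed from a binary confusion matrix as in the following sketch. It assumes the standard two-class definition applied one class at a time (one versus the rest); the abstract does not say how the four classes were handled, and the function name `mcc` is chosen here for illustration.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for one class treated as positive.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy counts for illustration only, not taken from the paper.
    print(mcc(tp=40, tn=45, fp=5, fn=10))
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it is often reported alongside overall accuracy on unbalanced classes.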

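Using the mathematical theory of the measure of diversity, a protein sequence identification method based on the minimum increment of diversity is proposed. Protein sequences are encoded by combining multiple features, and a classifier based on the minimum increment of diversity, MID_OMP, is built and applied to identify outer membrane protein sequences of Gram-negative bacteria. In the Jackknife test on the dataset, MID_OMP discriminates outer membrane proteins from α-helical transmembrane proteins with an accuracy of 95.7%, and outer membrane proteins from globular proteins with an accuracy of 91.0%. Mining results in 14 bacterial genomes show that MID_OMP has high sensitivity and specificity, and that its predictions are clearly more reliable than those of another OMP mining tool, TMBETA-GENOME.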

Transmembrane proteins allow cells to communicate extensively with the external world in a very accurate and specific way. They form principal nodes in several signaling pathways and attract large interest in therapeutic intervention, as the majority of pharmaceutical compounds target membrane proteins. Thus, according to the current genome annotation methods, a detailed structural/functional characterization at the protein level of each of the elements codified in the genome is also required. The extreme difficulty in obtaining high-resolution three-dimensional structures calls for computational approaches. Here we review to what extent the efforts made in the last few years, combining the structural characterization of membrane proteins with protein bioinformatics techniques, could help describe membrane proteins at a genome-wide scale. In particular, we analyze the use of comparative modeling techniques as a way of overcoming the lack of high-resolution three-dimensional structures in the human membrane proteome.

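Rapid analysis and small-scale preparation of proteins by high-performance affinity membrane chromatography   Cited by: 3 (self-citations: 0; citations by others: 3)
With glycidyl methacrylate cellulose composite membranes as the matrix and Protein A, human immunoglobulin G (HIgG), the triazine dye Cibacron blue F3GA, and iminodiacetic acid copper ions as ligands, high-performance affinity membrane chromatography media suitable for analysis and small-scale preparation were prepared by different methods. The basic performance of the high-performance affinity columns and their use in the quantitative determination of the corresponding proteins were examined. With this approach, different ligands can be chosen according to the target protein and its environment, giving fairly satisfactory results for the quantitative determination and small-scale preparation of various proteins.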

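Band 4.2 protein is an important erythrocyte membrane protein that is closely linked to erythrocyte shape, deformability, and oxygen-carrying function. By binding band 3 protein (the anion channel protein) and ankyrin, it is stably attached to the inner surface of the cell membrane, connecting the membrane skeleton network to the membrane, and is an important link between the membrane skeleton and the lipid bilayer. Loss of band 4.2 protein causes spherocytosis or elliptocytosis and hemolytic anemia of varying severity; severe cases require splenectomy. Recent studies suggest that band 4.2 protein plays an important role in maintaining the integrity and stability of the membrane skeleton. This article reviews research on the structure and function of band 4.2 protein.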

刘佳  蔡禄  邢永强 《生物信息学》2010,8(4):341-343,346
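Proteins are the material basis of all life activities, and studying protein interactions helps in understanding the molecular mechanisms of biological processes and of disease. Based on compositional features of protein sequences, this paper applies quadratic discriminant analysis based on the increment of diversity to predict 1,963 pairs of human protein-protein interactions. All prediction measures in the self-consistency test are above 79%, and the overall accuracy in cross-validation also exceeds 60%, indicating that the algorithm can be used to predict protein interactions.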

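Detergents play an essential role in the extraction and purification of membrane proteins and strongly affect their oligomeric state, crystallization conditions, and physicochemical properties. Analytical ultracentrifugation (AUC) measures the sedimentation of membrane protein-detergent complexes in solution in a centrifugal field. From this, hydrodynamic and thermodynamic properties such as the sedimentation coefficient, molar mass, hydrodynamic radius, and binding constants can be obtained, and the homogeneity and oligomeric state of the complex can then be judged. Taking TmrAB, an ATP-binding transporter (ABC transporter) from a thermophilic bacterium, as the object of study, this work combines analytical ultracentrifugation with size-exclusion chromatography and cryo-electron microscopy negative staining to study its homogeneity, oligomeric state, and the molar ratio of detergent to membrane protein. The results show that in DDM at 8 times the critical micelle concentration (CMC), TmrAB is homogeneous and exists as a single heterodimer, with a DDM-to-TmrAB molar ratio of 116:1. This study shows that analytical ultracentrifugation is a reliable means of determining the molecular mass of membrane proteins and studying their oligomeric state.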

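Predicting protein subcellular localization with an ensemble of improved KNN classifiers   Cited by: 1 (self-citations: 0; citations by others: 1)
Protein subcellular localization is predicted by using the Adaboost algorithm to combine several similarity-alignment K-nearest neighbor (KNN) classifiers. The similarity-alignment KNN algorithm takes amino acid composition, dipeptide composition, and pseudo-amino acid composition, respectively, as protein sequence features and uses Blast alignment in the KNN decision stage to determine the subcellular localization. Under the Jackknife test, the Adaboost ensemble classifier draws on the three types of sequence features, and the highest prediction success rates on the CH317 and Gram1253 datasets are 92.4% and 93.1%, respectively. The results show that the Adaboost ensemble of improved KNN classifiers is an effective method for predicting protein subcellular localization.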
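
Two of the sequence features named above, amino acid composition and dipeptide composition, and the Jackknife (leave-one-out) test can be sketched as follows. This is a simplified stand-in under stated assumptions: it uses a plain Euclidean KNN vote and omits the Blast-based decision step, the pseudo-amino acid composition, and the Adaboost ensemble described in the abstract. All function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """20-dimensional vector of residue frequencies (non-standard letters ignored)."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """400-dimensional vector of overlapping dipeptide frequencies."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DIPEPTIDES) or 1
    return [counts[d] / total for d in DIPEPTIDES]


def knn_predict(query, train, k=1):
    """Majority label among the k training vectors closest to the query."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(query, item[0])),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def jackknife_accuracy(samples, k=1):
    """Leave-one-out test: predict each sample from all the others."""
    hits = 0
    for i, (vector, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += knn_predict(vector, rest, k) == label
    return hits / len(samples)


if __name__ == "__main__":
    # Toy sequences and labels, for illustration only.
    toy = [("AAAA", "x"), ("AAAC", "x"), ("WWWW", "y"), ("WWWY", "y")]
    samples = [(amino_acid_composition(s), label) for s, label in toy]
    print(jackknife_accuracy(samples))
```

The Jackknife test is stricter than re-substitution because each protein is predicted by a model that never saw it, which is consistent with the lower Jackknife figures reported in the other abstracts on this page.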

Protein trans-splicing using split inteins is well established as a useful tool for protein engineering. Here we show, for the first time, that this method can be applied to a membrane protein under native conditions. We provide compelling evidence that the heptahelical proteorhodopsin can be assembled from two separate fragments consisting of helical bundles A and B and C, D, E, F, and G via a splicing site located in the BC loop. The procedure presented here is based on dual expression and ligation in vivo. Global fold, stability, and photodynamics were analyzed in detergent by CD, stationary, as well as time-resolved optical spectroscopy. The fold within lipid bilayers has been probed by high field and dynamic nuclear polarization-enhanced solid-state NMR utilizing a 13C-labeled retinal cofactor and extensively 13C-15N-labeled protein. Our data show unambiguously that the ligation product is identical to its non-ligated counterpart. Furthermore, our data highlight the effects of BC loop modifications on the photocycle kinetics of proteorhodopsin. Our data demonstrate that a correctly folded and functionally intact protein can be produced in this artificial way. Our findings are of high relevance for a general understanding of the assembly of membrane proteins and for elucidating intramolecular interactions, and they offer the possibility of developing novel labeling schemes for spectroscopic applications.

The genetic diversity of 43 sources of Upland cotton germplasm with different parental origins, breeding periods, and ecological growing areas in China was studied on the basis of simple sequence repeat (SSR) markers. A total of 130 gene alleles with 80% polymorphism were detected from 36 SSR primers. The number of alleles per primer ranged from two to eight, with an average of 3.6. The polymorphism information content (PIC) ranged from 0.278 to 0.865, with an average of 0.62. The average genotype diversity index (H') was 1.102; the highest was 2.039 and the lowest was 0.451. The average coefficient of genetic similarity of SSR markers among the source germplasm was 0.610, ranging from 0.409 to 0.865. These results indicate that the genetic diversity at the genomic level of the selected source germplasm was rich and was representative of the diversity of the germplasm in general. The diversity at the genome level of the base germplasm from the second and third breeding periods was lower than that of the first period, indicating that the genetic background of cotton in China has gradually narrowed. The diversity of SSR markers among the base germplasm from the early-maturity cotton growing areas in the north was higher than that of germplasm from the Huanghe and Yangtze growing areas. The molecular marker genetic similarity index of the domestic varieties was higher than that of the introduced varieties, which indicates that the genetic diversity of domestic cultivars was lower than that of the introduced varieties. This study gives an overview of the genetic diversity of the cotton germplasm base in China and provides a guide for breeders to develop new cultivars efficiently.
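
The abstract reports PIC and H' values without giving formulas. The sketch below assumes the widely used definitions: PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j² for allele frequencies p_i at one locus, and the Shannon-type index H' = −Σ p_i ln p_i. Whether the study used exactly these forms is an assumption, and the function names are illustrative.

```python
import math


def pic(freqs):
    """Polymorphism information content for one locus from allele frequencies."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1 - homozygosity - cross


def shannon_index(freqs):
    """Shannon-type diversity index H' = -sum(p_i ln p_i); zero frequencies are skipped."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


if __name__ == "__main__":
    # Toy allele frequencies for a single locus, for illustration only.
    freqs = [0.5, 0.3, 0.2]
    print(pic(freqs), shannon_index(freqs))
```

Both measures grow with the number of alleles and with how evenly they are distributed, so a locus with eight evenly spread alleles scores far higher than one dominated by a single allele.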
