Similar literature
1.
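Template-based modeling is the most accurate and effective approach in protein structure prediction, and its success depends heavily on template quality. To find suitable templates for a query sequence, this paper proposes a profile-profile alignment method that aligns the query sequence against proteins of known structure in a template library and then selects templates in descending order of the alignment Z-score. The results show that the proposed profile-profile alignment method clearly outperforms PSI-BLAST on the test set, improving accuracy by about 14.3%, and a paired t-test indicates that the improvement is statistically significant. We conclude that the method can be used to search for remote homologous templates for query sequences with low sequence similarity and to guide subsequent tertiary structure prediction.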
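The template selection step described in this abstract (rank templates by the Z-score of their alignment score) can be sketched as follows. The abstract does not say how the Z-score is computed, so this sketch assumes the simplest reading: each raw score is standardized against the mean and standard deviation of all scores in the template library. The function name, template identifiers, and scores are hypothetical.

```python
from statistics import mean, pstdev


def rank_templates_by_zscore(raw_scores):
    """Rank templates by the Z-score of their raw alignment scores.

    raw_scores: dict mapping a template identifier to its raw alignment score.
    Returns a list of (template_id, z_score), highest Z-score first.
    Assumption: the background distribution is the set of scores itself.
    """
    values = list(raw_scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the library scores
    if sigma == 0:
        # All scores identical: no template stands out, so every Z-score is 0.
        return [(tid, 0.0) for tid in raw_scores]
    z = {tid: (score - mu) / sigma for tid, score in raw_scores.items()}
    return sorted(z.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores for four templates (illustrative numbers only).
scores = {"tmplA": 52.0, "tmplB": 18.0, "tmplC": 21.0, "tmplD": 17.0}
ranking = rank_templates_by_zscore(scores)
best_template, best_z = ranking[0]
```

A Z-score makes scores comparable across queries of different length and composition, which is presumably why it is preferred over the raw score when choosing a cutoff for "suitable" templates.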

2.
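Objective: This paper predicts the three-dimensional structure of the α-glucosidase sequence encoded by Bacillus stearothermophilus ATCC 12016, designs mutation sites, builds mutant models, and evaluates and analyzes the predicted structure and the mutants. Methods: The nucleic acid and protein sequences of α-glucosidase were analyzed to determine the homology and conserved-region features of the protein sequence; using the three-dimensional structure of oligo-1,6-glucosidase from Bacillus cereus ATCC 7064 as the template, the three-dimensional structures of the α-glucosidase sequence and of the mutant models were predicted by homology modeling. Results: Evaluation and analysis of the predicted three-dimensional structure and the mutants showed that the predicted and designed structures meet the criteria for a reasonable model. Conclusion: Based on these results, the constructed three-dimensional structural model of α-glucosidase is reasonable and provides a theoretical research platform for protein engineering applications.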

3.
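Knowledge-Based Protein Structure Prediction   Total citations: 5, self-citations: 0, citations by others: 5
This paper reviews knowledge-based methods for predicting protein three-dimensional structure and their progress in recent years. At present there are two main classes of knowledge-based structure prediction methods. The first is homology modeling, a relatively mature technique whose models are fairly reliable but which applies only to target sequences with relatively high homology. The second is protein inverse folding, which mainly comprises the 3D profile method and potential-function-based methods; it yields the overall spatial trace of the target protein and is mainly useful for predicting the structure of proteins with low sequence homology.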

4.
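Research Trends in the Key Steps of Homology Modeling   Total citations: 1, self-citations: 0, citations by others: 1
Protein structure prediction by homology modeling has become a technique for obtaining protein structures quickly, and it will also become a powerful tool for completing structural genomics projects. Homology modeling means finding a protein that is homologous to the target sequence and has an experimentally determined structure to serve as a template, and then building a structural model of the target sequence from it. What limits the application of this approach is mainly the accuracy of its key steps, namely the target-template sequence alignment and loop modeling. Once models reach a convincing level of accuracy, more precise computer-aided drug design and protein engineering, and even the design of proteins with entirely new functions, will become possible. This paper reviews progress in improving the accuracy of the key steps of homology modeling through algorithms and strategies.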

5.
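Research on Methods for Protein Fold Type Recognition   Total citations: 1, self-citations: 0, citations by others: 1
Protein fold type recognition is an important method for analyzing protein structure. Taking 822 all-β proteins with sequence similarity below 25% as the study set, we extracted the secondary structure segments of the core structure and the hydrogen-bonding interactions between segments as fold-type feature parameters, and built a template database of 74 fold types for all-β proteins. We define a secondary structure matching function SS, a hydrogen-bonding potential function BP, and a scoring function P between the query protein and a fold-type template; the fold type of the template with the smallest P value is assigned to the query protein. Three groups of 50 all-β protein domains each were randomly drawn from SCOP 1.69 for prediction, giving recognition accuracies of 56%, 56%, and 42%; on the test set provided by Ding et al., the overall recognition accuracy was 61.5%. The results and comparisons show that this is an effective fold type recognition method.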

6.
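A new method for protein secondary structure prediction is proposed. The method extracts from amino acid sequences species-specific protein secondary structure terms analogous to "words" in natural language; these terms form a protein secondary structure dictionary that describes the relationship between amino acid sequences and secondary structure. Predicting secondary structure resembles joint word segmentation and part-of-speech tagging in natural language processing. The method treats the term sequence as a Markov chain and uses the Viterbi algorithm to search for the maximum probability of each term being labeled with a given secondary structure type, using a word lattice to represent the segmentation results and a maximum entropy Markov model to compute the secondary structure probabilities of the terms. The predicted secondary structure is the sequence of secondary structure types corresponding to the optimal segmentation. The method was tested on protein sequences from four species and compared with the PHD method. The experiments show that its Q3 accuracy is 3.9% higher than that of PHD and its SOV accuracy is 4.6% higher. Incorporating locally similar sequences found by BLAST further improves prediction accuracy. On 50 CASP5 target protein sequences, the Q3 accuracy was 78.9% and the SOV accuracy was 77.1%. A protein secondary structure prediction server based on this method is available at http://www.insun.hit.edu.cn:81/demos/biology/index.html.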
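To make the decoding and scoring in this abstract concrete, here is a minimal sketch of Viterbi decoding followed by the Q3 measure (the fraction of residues whose three-state label is predicted correctly). It is deliberately simplified: it uses a plain hidden Markov model over single residues rather than the paper's word lattice and maximum entropy Markov model, and every probability below is an invented toy value.

```python
import math

STATES = ("H", "E", "C")  # helix, strand, coil


def _log(p):
    """Logarithm that maps probability 0 to minus infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations, start_p, trans_p, emit_p):
    """Return the most probable state path for `observations` under an HMM."""
    # Log space avoids numerical underflow on long sequences.
    best = [{s: _log(start_p[s]) + _log(emit_p[s].get(observations[0], 0.0))
             for s in STATES}]
    back = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            prev, score = max(
                ((r, best[-1][r] + _log(trans_p[r][s])) for r in STATES),
                key=lambda pair: pair[1],
            )
            scores[s] = score + _log(emit_p[s].get(obs, 0.0))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


def q3(predicted, observed):
    """Fraction of positions whose three-state label matches."""
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Toy parameters (assumptions for illustration, not fitted to any data).
start_p = {s: 1 / 3 for s in STATES}
trans_p = {r: {s: (0.8 if r == s else 0.1) for s in STATES} for r in STATES}
emit_p = {
    "H": {"A": 0.4, "L": 0.4, "V": 0.05, "I": 0.05, "G": 0.05, "P": 0.05},
    "E": {"V": 0.4, "I": 0.4, "A": 0.05, "L": 0.05, "G": 0.05, "P": 0.05},
    "C": {"G": 0.4, "P": 0.4, "A": 0.05, "L": 0.05, "V": 0.05, "I": 0.05},
}

predicted = viterbi("AALLVVIIGGPP", start_p, trans_p, emit_p)
accuracy = q3(predicted, "HHHHEEEECCCC")
```

SOV, the second measure quoted in the abstract, scores overlap of whole segments rather than single residues and is not implemented here.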

7.
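Cloning, Sequence Analysis, and Protein Structure Prediction of the Chinese Narcissus Lectin Gene NTA   Total citations: 1, self-citations: 0, citations by others: 1
Lectins are sugar-specific binding proteins that can recognize different carbohydrates. Plant lectins are an important component of the plant defense system. In this study, the lectin gene NTA was cloned from flower buds of Chinese narcissus by RT-PCR, and bioinformatics methods were used to analyze its nucleotide sequence and encoded amino acid sequence and to predict its protein structure. The results show that the NTA gene obtained is 698 bp long and contains a complete open reading frame of 516 bp. The gene encodes a lectin precursor protein of 172 amino acids, with an isoelectric point of 5.84 and a molecular weight of 18,615.19 Da. Sequence alignment shows that the encoded protein is highly homologous to lectins from other monocots, namely hybrid narcissus, snowdrop, clivia, lycoris, and amaryllis, with homologies of 84%, 80%, 77%, 78%, and 82%, respectively. Protein structure prediction shows that the Chinese narcissus lectin is structurally very similar to the daffodil lectin. Analysis of the encoded protein and protein structure modeling indicate that the protein contains three special functional domains and an alpha-D-mannose-binding surface (QXDXNXVXY).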

8.
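Advances in Protein Structure Prediction   Total citations: 1, self-citations: 0, citations by others: 1
Protein structure prediction is one of the main current challenges in bioinformatics. According to how heavily they rely on information in the PDB database, prediction methods fall into two classes: template-dependent models and ab initio prediction. Template-dependent models can be further divided into homology modeling and threading. This paper describes the main steps of each prediction method and analyzes the bottlenecks that constrain each method, along with research progress. Homology modeling achieves relatively high structural accuracy but depends strongly on templates; threading, used for low-homology cases, is an important research direction for template-dependent models; in ab initio prediction, the combined use of statistical and physical functions has worked well, but fragments of more than 150 residues remain a great challenge.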

9.
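Bioinformatics Analysis of the Helicobacter pylori Lpp20 Protein   Total citations: 2, self-citations: 0, citations by others: 2
Objective: To analyze the main chemical and immunological molecular characteristics of the Helicobacter pylori Lpp20 protein and lay a foundation for research on genetically engineered vaccines and diagnostic antigens. Methods: Based on the amino acid sequence of Lpp20, bioinformatics tools were used to analyze the protein sequence and predict its signal peptide, transmembrane regions, hydrophobicity, secondary structure, tertiary structure, and other properties. Results: Lpp20 has a signal peptide, a lipoprotein signal peptidase cleavage site, and a lipobox motif, and has no transmembrane region, so it may be a peripheral membrane protein; its secondary structure is predominantly α-helical, and its tertiary structure is a compact globule. Conclusion: The analysis lays a foundation for the development of vaccines based on the Helicobacter pylori Lpp20 protein.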

10.
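Objective: To predict the structure and function of human mitochondrial transcription termination factor 3 (hMTERF3) using bioinformatics. Methods: The hMTERF3 protein was studied systematically using the GenBank, Uniprot, ExPASy, and SWISS-PROT database resources and various bioinformatics software, covering its physicochemical properties, transmembrane regions and signal peptide, secondary structure and functional domains, subcellular localization, functional classification prediction, multiple sequence alignment of homologous proteins, phylogenetic tree construction, and tertiary structure homology modeling. Results: The software predicted a relative molecular mass of 47.97 × 10^3 and an isoelectric point of 8.60 for hMTERF3, with no signal peptide or transmembrane region; secondary structure analysis showed mainly helices and random coils, with six MTERF motifs, and the tertiary structure prediction agreed with the secondary structure prediction; subcellular localization analysis placed the protein in human mitochondria; functional classification predicted it to be a transport and binding protein involved in the regulation of gene transcription; multiple sequence alignment of homologous proteins and evolutionary analysis showed that hMTERF3 is highly homologous to the MTERF3 proteins of rat, mouse, and other mammals, with which it clusters in the phylogenetic tree. Conclusion: The bioinformatics analysis of hMTERF3 provides a theoretical basis for further experimental studies of the structure and function of this protein.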

11.
Jie Hou, Tianqi Wu, Renzhi Cao, Jianlin Cheng. Proteins, 2019, 87(12): 1165-1178
Predicting residue-residue distance relationships (e.g., contacts) has become the key direction for advancing protein structure prediction since the 2014 CASP11 experiment, while deep learning has revolutionized the technology for contact and distance distribution prediction since its debut in the 2012 CASP10 experiment. During the 2018 CASP13 experiment, we enhanced our MULTICOM protein structure prediction system with three major components: contact distance prediction based on deep convolutional neural networks, distance-driven template-free (ab initio) modeling, and protein model ranking empowered by deep learning and contact prediction. Our experiment demonstrates that contact distance prediction and deep learning methods are the key reasons that MULTICOM was ranked 3rd out of all 98 predictors in both template-free and template-based structure modeling in CASP13. Deep convolutional neural networks can utilize global information in pairwise residue-residue features such as coevolution scores to substantially improve contact distance prediction, which played a decisive role in correctly folding some free modeling and hard template-based modeling targets. Deep learning also successfully integrated one-dimensional structural features, two-dimensional contact information, and three-dimensional structural quality scores to improve protein model quality assessment, where contact prediction was demonstrated for the first time to consistently enhance the ranking of protein models. The success of the MULTICOM system clearly shows that protein contact distance prediction and model selection driven by deep learning hold the key to solving the protein structure prediction problem. However, there are still challenges in accurately predicting protein contact distance when there are few homologous sequences, folding proteins from noisy contact distances, and ranking models of hard targets.
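For readers unfamiliar with the term, a residue-residue contact is usually derived from inter-residue distances by thresholding. The sketch below assumes the convention commonly used in CASP contact assessment (two residues are in contact when their Cβ atoms are closer than 8 Å) and a minimum sequence separation of 6; the abstract itself states neither value, and the hairpin-like coordinates are invented.

```python
import math


def contact_map(coords, threshold=8.0, min_separation=6):
    """Build a symmetric boolean contact map from per-residue coordinates.

    coords: list of (x, y, z) tuples, one representative atom per residue.
    Pairs closer in sequence than `min_separation` are ignored because
    they are trivially near each other along the chain.
    """
    n = len(coords)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < threshold:
                contacts[i][j] = contacts[j][i] = True
    return contacts


# Toy hairpin: five residues going out along x, five coming back 5 Å away.
coords = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0), (15.2, 0.0, 0.0),
    (15.2, 5.0, 0.0), (11.4, 5.0, 0.0), (7.6, 5.0, 0.0), (3.8, 5.0, 0.0), (0.0, 5.0, 0.0),
]
cmap = contact_map(coords)
pairs = [(i, j) for i in range(len(coords)) for j in range(i + 1, len(coords)) if cmap[i][j]]
```

A distance predictor generalizes this binary map by predicting the distance itself (or a distribution over distance bins) for every residue pair, which carries more information for folding.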

12.
Template-based methods for predicting protein structure provide models for a significant portion of the protein but often contain insertions or chain ends (InsEnds) of indeterminate conformation. The local structure prediction "problem" entails modeling the InsEnds onto the rest of the protein. A well-known limit involves predicting loops of ≤12 residues in crystal structures. However, InsEnds may contain as many as ~50 amino acids, and the template-based model of the protein itself may be imperfect. To address these challenges, we present a free modeling method for predicting the local structure of loops and large InsEnds in both crystal structures and template-based models. The approach uses single amino acid torsional angle "pivot" moves of the protein backbone with a Cβ-level representation. Nevertheless, our accuracy for loops is comparable to existing methods. We also apply a more stringent test, the blind structure prediction and refinement categories of the CASP9 tournament, where we improve the quality of several homology based models by modeling InsEnds as long as 45 amino acids, sizes generally inaccessible to existing loop prediction methods. Our approach ranks as one of the best in the CASP9 refinement category that involves improving template-based models so that they can function as molecular replacement models to solve the phase problem for crystallographic structure determination.
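A torsional "pivot" move of the kind mentioned in this abstract can be illustrated on a simplified bead chain: one bond is chosen as the rotation axis and every bead downstream of it is rotated rigidly about that axis, which changes one torsion angle while leaving all bond lengths and bond angles intact. This is a geometric sketch only, not the authors' move set or energy function; the chain coordinates and function names are hypothetical.

```python
import math


def rotate_about_axis(point, origin, direction, angle):
    """Rotate `point` by `angle` radians about the line through `origin` along `direction`."""
    norm = math.sqrt(sum(c * c for c in direction))
    kx, ky, kz = (c / norm for c in direction)
    vx, vy, vz = (p - o for p, o in zip(point, origin))
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    dot = kx * vx + ky * vy + kz * vz
    cross = (ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx)
    # Rodrigues' rotation formula: v' = v cos(a) + (k x v) sin(a) + k (k . v)(1 - cos(a))
    rotated = (
        vx * cos_a + cross[0] * sin_a + kx * dot * (1 - cos_a),
        vy * cos_a + cross[1] * sin_a + ky * dot * (1 - cos_a),
        vz * cos_a + cross[2] * sin_a + kz * dot * (1 - cos_a),
    )
    return tuple(o + r for o, r in zip(origin, rotated))


def pivot_move(chain, pivot_index, angle):
    """Rotate all beads after the bond (pivot_index, pivot_index + 1) about that bond."""
    a, b = chain[pivot_index], chain[pivot_index + 1]
    axis = tuple(q - p for p, q in zip(a, b))
    fixed = list(chain[: pivot_index + 2])  # beads up to and including the axis stay put
    moved = [rotate_about_axis(p, b, axis, angle) for p in chain[pivot_index + 2:]]
    return fixed + moved


# Hypothetical five-bead chain lying in the xy-plane.
chain = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0), (3.5, 1.4, 0.0), (4.0, 2.8, 0.0)]
new_chain = pivot_move(chain, 1, math.pi / 2)
```

In a Monte Carlo search, a move like this would be proposed at a random position with a random angle and then accepted or rejected according to an energy function, which is outside the scope of this sketch.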

13.
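Protein fold recognition algorithms are among the important methods for predicting protein three-dimensional structure, and they have been applied fruitfully in many areas of the biological sciences. Over the past decade we have witnessed a series of protein fold recognition methods based on different computational approaches. Among them, machine learning and profile-profile alignment are two widely used and effective approaches in protein fold recognition. Besides advances in computational methods, the ever-growing protein structure databases are also an important factor in the steadily improving prediction accuracy of fold recognition. In this article we briefly review state-of-the-art algorithms for protein fold recognition and discuss some strategies that may be applicable to improving them.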

14.
Substantial progress in protein structure prediction has been made by utilizing deep learning and residue-residue distance prediction since CASP13. Inspired by these advances, we improve our CASP14 MULTICOM protein structure prediction system by incorporating three new components: (a) a new deep learning-based protein inter-residue distance predictor to improve template-free (ab initio) tertiary structure prediction, (b) an enhanced template-based tertiary structure prediction method, and (c) distance-based model quality assessment methods empowered by deep learning. In the 2020 CASP14 experiment, the MULTICOM predictor was ranked seventh out of 146 predictors in tertiary structure prediction and third out of 136 predictors in inter-domain structure prediction. The results demonstrate that template-free modeling based on deep learning and residue-residue distance prediction can predict the correct topology for almost all template-based modeling targets and a majority of hard targets (template-free targets or targets whose templates cannot be recognized), which is a significant improvement over the CASP13 MULTICOM predictor. Moreover, template-free modeling performs better than template-based modeling not only on hard targets but also on targets that have homologous templates. The performance of template-free modeling largely depends on the accuracy of distance prediction, which is closely related to the quality of multiple sequence alignments. The structural model quality assessment works well on targets for which enough good models can be predicted, but it may perform poorly when only a few good models are predicted for a hard target and the distribution of model quality scores is highly skewed. MULTICOM is available at https://github.com/jianlin-cheng/MULTICOM_Human_CASP14/tree/CASP14_DeepRank3 and https://github.com/multicom-toolbox/multicom/tree/multicom_v2.0 .

15.
CASP (critical assessment of structure prediction) assesses the state of the art in modeling protein structure from amino acid sequence. The most recent experiment (CASP13 held in 2018) saw dramatic progress in structure modeling without use of structural templates (historically “ab initio” modeling). Progress was driven by the successful application of deep learning techniques to predict inter-residue distances. In turn, these results drove dramatic improvements in three-dimensional structure accuracy: With the proviso that there are an adequate number of sequences known for the protein family, the new methods essentially solve the long-standing problem of predicting the fold topology of monomeric proteins. Further, the number of sequences required in the alignment has fallen substantially. There is also substantial improvement in the accuracy of template-based models. Other areas (model refinement, accuracy estimation, and the structure of protein assemblies) have again yielded interesting results. CASP13 placed increased emphasis on the use of sparse data together with modeling, and chemical crosslinking, SAXS, and NMR all yielded more mature results. This paper summarizes the key outcomes of CASP13. The special issue of PROTEINS contains papers describing the CASP13 assessments in each modeling category and contributions from the participants.

16.
Protein docking is essential for structural characterization of protein interactions. Besides providing the structure of protein complexes, modeling of proteins and their complexes is important for understanding the fundamental principles and specific aspects of protein interactions. The accuracy of protein modeling, in general, is still less than that of the experimental approaches. Thus, it is important to investigate the applicability of docking techniques to modeled proteins. We present new comprehensive benchmark sets of protein models for the development and validation of protein docking, as well as a systematic assessment of free and template-based docking techniques on these sets. As opposed to previous studies, the benchmark sets reflect the real case modeling/docking scenario where the accuracy of the models is assessed by the modeling procedure, without reference to the native structure (which would be unknown in practical applications). We also expanded the analysis to include docking of protein pairs where proteins have different structural accuracy. The results show that, in general, the template-based docking is less sensitive to the structural inaccuracies of the models than the free docking. The near-native docking poses generated by the template-based approach typically also have higher ranks than those produced by the free docking (although the free docking is indispensable in modeling the multiplicity of protein interactions in a crowded cellular environment). The results show that docking techniques are applicable to protein models in a broad range of modeling accuracy. The study provides clear guidelines for practical applications of docking to protein models.

17.
After decades of research, protein structure prediction remains a very challenging problem. In order to address the different levels of complexity of structural modeling, two types of modeling techniques, template-based modeling and template-free modeling, have been developed. Template-based modeling can often generate a moderate- to high-resolution model when a similar, homologous template structure is found for a query protein but fails if no template or only incorrect templates are found. Template-free modeling, such as fragment-based assembly, may generate models of moderate resolution for small proteins of low topological complexity. Seldom have the two techniques been integrated together to improve protein modeling. Here we develop a recursive protein modeling approach to selectively and collaboratively apply template-based and template-free modeling methods to model template-covered (i.e. certain) and template-free (i.e. uncertain) regions of a protein. A preliminary implementation of the approach was tested on a number of hard modeling cases during the 9th Critical Assessment of Techniques for Protein Structure Prediction (CASP9) and successfully improved the quality of modeling in most of these cases. Recursive modeling can significantly reduce the complexity of protein structure modeling and integrate template-based and template-free modeling to improve the quality and efficiency of protein structure prediction.

18.
Key to successful protein structure prediction is a potential that recognizes the native state from misfolded structures. Recent advances in empirical potentials based on known protein structures include improved reference states for assessing random interactions, sidechain-orientation-dependent pair potentials, potentials for describing secondary or supersecondary structural preferences and, most importantly, optimization protocols that sculpt the energy landscape to enhance the correlation between native-like features and the energy. Improved clustering algorithms that select native-like structures on the basis of cluster density also resulted in greater prediction accuracy. For template-based modeling, these advances allowed improvement in predicted structures relative to their initial template alignments over a wide range of target-template homology. This represents significant progress and suggests applications to proteome-scale structure prediction.
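Selecting a native-like structure "on the basis of cluster density" can be sketched as picking the candidate model that has the most neighbors within a distance cutoff. The abstract does not specify the distance measure or the cutoff, so the sketch assumes a precomputed pairwise distance matrix (for example, RMSD in Å) and a hypothetical cutoff of 2.0; the matrix values are invented.

```python
def densest_model(pairwise, cutoff):
    """Return (center, members) for the most densely populated neighborhood.

    pairwise: symmetric matrix of distances between candidate models.
    center:   index of the model with the most neighbors within `cutoff`.
    members:  indices of all models within `cutoff` of the center (center included).
    """
    neighbor_counts = [
        sum(1 for j, d in enumerate(row) if j != i and d <= cutoff)
        for i, row in enumerate(pairwise)
    ]
    center = max(range(len(pairwise)), key=neighbor_counts.__getitem__)
    members = [j for j, d in enumerate(pairwise[center]) if d <= cutoff]
    return center, members


# Hypothetical distances among five candidate models (assumed RMSD in Å).
pairwise = [
    [0.0, 1.0, 2.4, 6.0, 7.0],
    [1.0, 0.0, 1.2, 6.5, 7.2],
    [2.4, 1.2, 0.0, 5.8, 6.9],
    [6.0, 6.5, 5.8, 0.0, 2.5],
    [7.0, 7.2, 6.9, 2.5, 0.0],
]
center, members = densest_model(pairwise, cutoff=2.0)
```

The rationale is that conformations near the native state tend to be sampled repeatedly, so the center of the largest cluster is often a better pick than the single lowest-energy model.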

19.
Park H, Seok C. Proteins, 2012, 80(8): 1974-1986
Contemporary template-based modeling techniques allow applications of modeling methods to vast biological problems. However, they tend to fail to provide accurate structures for less-conserved local regions in sequence even when the overall structure can be modeled reliably. We call these regions unreliable local regions (ULRs). Accurate modeling of ULRs is of enormous value because they are frequently involved in functional specificity. In this article, we introduce a new method for modeling ULRs in template-based models by employing a sophisticated loop modeling technique. Combined with our previous study on protein termini, the method is applicable to refinement of both loop and terminus ULRs. A large-scale test carried out in a blind fashion in CASP9 (the 9th Critical Assessment of techniques for protein structure prediction) shows that ULR structures are improved over initial template-based models by refinement in more than 70% of the successfully detected ULRs. It is also notable that successful modeling of several long ULRs over 12 residues is achieved. Overall, the current results show that a careful application of loop and terminus modeling can be a promising tool for model refinement in template-based modeling.

20.
Most protein structure prediction methods use templates to assist in the construction of protein models. In this paper, we analyse the current state of template-based modelling approaches and reach an estimate of the empirical limits of these methods. Our analysis shows that current prediction methods are already reaching these empirical accuracy limits in the easier cases, where finding a close homologue to the native target structure is not a problem. However, we find that even in the absence of alignment errors and using optimal templates, template-based methods have intrinsic limitations, suggesting that other methodologies, such as ab initio procedures, must be used if accuracy is ultimately to be improved.
