Similar articles
 18 similar articles found (search time: 78 ms)
1.
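Classification method and classification database for protein fold types   Total citations: 1, self-citations: 0, citations by others: 1
李晓琴, 仁文科, 刘岳, 徐海松, 乔辉. 《生物信息学》 (Bioinformatics), 2010, 8(3): 245-247, 253
The study of protein folding principles is a major frontier topic in the life sciences, and fold classification is the foundation of protein folding research. At present, protein fold types are classified essentially by experts, and the classifications differ between databases, so a fold-type database built on a unified principle is urgently needed. Starting from the proteins in the ASTRAL-1.65 database with sequence homology below 25% and resolution better than 2.5 Å, and drawing on observation of protein spatial structures and analysis of fold-type characteristics, this paper proposes a fold-type classification method that centers on the protein folding core, takes the topological invariance of protein structure as its principle, and is based on the composition, connectivity, and spatial arrangement of the regular structural segments of the folding core. A low-similarity protein fold classification database, LIFCA, was built; it contains 259 protein fold types. The database lays a foundation for further protein fold modeling and data mining, protein fold recognition, and studies of the evolution of protein fold structures.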

2.
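The study of protein folding principles is a major frontier topic in the life sciences, and fold-type classification is the foundation of protein folding research. A template database for the BRD-like fold type was constructed, and a comprehensive multi-template classification method was established and applied to classify this fold type. On the experimental set of 12,117 samples, sensitivity and specificity were 0.923 and 0.997, with an MCC of 0.72; on an independent test set of 2,260 samples, sensitivity and specificity were 0.941 and 0.998, with an MCC of 0.86. These results indicate that the multi-template comprehensive classification method can be used for protein fold-type classification.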

3.
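Fold-type classification is an important part of protein classification research. Taking the PH domain-like barrel fold type in the SCOP database as the object of study, 61 samples with sequence similarity below 25% were selected as the test set. Through structural feature analysis, templates for this fold type and their corresponding characteristic parameters were determined. Using the spatial structure alignment between a template and a query protein, a new fold-type scoring function, Fscore, was proposed, and an Fscore-based classification method was established and applied to this fold type. Tested on 16,711 samples from Astral 1.75 with sequence similarity below 95%, the method achieved a specificity of 99.97%. The results indicate that the characteristic parameters capture the essence of the fold type, and that Fscore and the classification method built on it can be used for automatic classification of PH domain-like barrel proteins.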

4.
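The study of protein spatial structure is an important topic in molecular biology, cell biology, biochemistry, and drug design. A fold type reflects the topological pattern of a protein's core structure, and fold recognition is an important part of research on the relationship between protein sequence and structure. The 53 fold types with relatively large sample sizes in the LIFCA database were selected, and fold recognition was performed with the functional domain composition method. With the Astral 1.65 samples having sequence identity below 95% as the test set, the whole-database test gave an average sensitivity of 96.42%, a specificity of 99.91%, and a Matthews correlation coefficient (MCC) of 0.91. These statistics indicate that the functional domain composition method works well for protein fold recognition, and that the relatively simple classification rules of LIFCA capture most of the functional properties of proteins, reflecting the correspondence between structure and function.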

5.
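A classification method for α/β-class protein fold types   Total citations: 1, self-citations: 0, citations by others: 1
马帅, 王勤, 李晓琴. 《生物信息学》 (Bioinformatics), 2014, 12(2): 123-132
The study of protein folding principles is one of the major frontier topics in the life sciences, and fold classification is the foundation of protein folding research. Based on the LIFCA database, 55 α/β-class fold types with more than 2 samples each were selected as the objects of study. Combining the definition of each fold type with its conserved topological features, templates and corresponding characteristic parameters were determined for the 55 fold types. A template-based scoring function, Mul-Fscore, was established and, together with secondary structure parameter information, yielded a multi-template classification method for the 55 α/β-class fold types. Tested on 931 samples from the LIFCA database, the method gave an average specificity, average sensitivity, and MCC of 99.58%, 79.47%, and 79.39%, respectively. Compared with TM-score classification, Mul-Fscore had better sensitivity and MCC and a similar average specificity.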

6.
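The protein structure database used for fold pattern recognition was classified by structure into four class libraries (all-α, all-β, α/β, and α+β) plus one overall library, and match evaluation functions (mean potentials) of differing sensitivity were derived statistically from each. Tests with the different mean potentials on proteins of different structural classes showed that the mean potential derived from the α/β library had the strongest predictive power and the one derived from the all-α library the weakest; the prediction success rate was highest for α/β proteins and lowest for all-α proteins. This is consistent with α/β proteins having the most regular structures, and with all-α proteins not reflecting their full structural characteristics when prosthetic groups are not included.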

7.
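《生命科学研究》 (Life Science Research), 2016, (5): 381-388
Protein fold recognition is an important part of protein structure research, and fold-type classification is the foundation of fold recognition. Through a systematic study of the fold types of α-class proteins in the ASTRAL-1.65 database, a fold-type template database was built and template characteristic parameters reflecting fold topology were extracted. From these parameters and TM-align structural alignment results, a parameter-based scoring function, Fdscore, was constructed, enabling automated classification of α-class fold types. On the same dataset, the Fdscore method achieved an average sensitivity, average specificity, and MCC of 71.86%, 99.49%, and 0.69, respectively, all higher than the corresponding TM-score results. These results indicate that the method can be used for automated classification of α-class protein fold types.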

8.
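Advances in protein folding rate prediction   Total citations: 2, self-citations: 0, citations by others: 2
Predicting protein folding rates is one of the most challenging problems in biophysics today. In recent years the field has made considerable progress, and many empirical parameters have been proposed, such as contact order, long-range order, total contact distance, chain topology parameter, secondary structure content, effective length, helix parameter, and n-order contact distance. All of these parameters correlate well with protein folding rates, and the predictions of the various methods built on them agree reasonably well with experimental data.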
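
Of the parameters named above, contact order is the simplest to compute. A minimal sketch follows, assuming the usual definition of relative contact order as the average sequence separation of contacting residue pairs divided by chain length. How a "contact" is defined (atom types, distance cutoff) varies between studies and is left to the caller here; the function name and the example contact list are hypothetical.

```python
def relative_contact_order(contacts, chain_length):
    """Relative contact order: (1 / (L * N)) * sum(|i - j|).

    contacts     -- iterable of (i, j) residue-index pairs judged to be
                    in contact (the contact definition is the caller's
                    choice and is an assumption of this sketch)
    chain_length -- number of residues L in the protein
    """
    contacts = list(contacts)
    if not contacts or chain_length <= 0:
        raise ValueError("need at least one contact and a positive length")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (chain_length * len(contacts))


# Hypothetical contact list for a 20-residue chain, for illustration only.
print(relative_contact_order([(1, 5), (2, 10)], chain_length=20))
```

A chain dominated by local contacts (small |i - j|) gets a low value, and one with many long-range contacts gets a high value, which is the property the rate-prediction methods exploit.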

9.
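Protein folding   Total citations: 2, self-citations: 0, citations by others: 2
This review focuses on the thermodynamic control and kinetic control hypotheses of protein folding, briefly introduces several protein folding models, and analyzes why polypeptide chains fold rapidly in vivo.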

10.
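Research on a method for protein fold-type recognition   Total citations: 1, self-citations: 0, citations by others: 1
Protein fold recognition is an important method for analyzing protein structure. Taking 822 all-β proteins with sequence similarity below 25% as the objects of study, the secondary structure segments of the core structure and the hydrogen-bonding interactions between segments were extracted as fold-type characteristic parameters, and a template database of 74 fold types of all-β proteins was constructed. A secondary structure matching function SS, a hydrogen-bond interaction potential function BP, and a scoring function P were defined between a query protein and a fold-type template; the fold type of the template with the smallest P value is assigned to the query protein. Three groups of 50 all-β protein domains each were drawn at random from SCOP 1.69 for prediction, giving accuracies of 56%, 56%, and 42%; on the test set provided by Ding et al., the overall accuracy was 61.5%. The results and comparisons indicate that this is an effective fold recognition method.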

11.
Insights into protein folding rely increasingly on the synergy between experimental and theoretical approaches. Developing successful computational models requires access to experimental data of sufficient quantity and high quality. We compiled folding rate constants for what initially appeared to be 184 proteins from 15 published collections/web databases. To generate the highest confidence in the dataset, we verified the reported lnkf value and exact experimental construct and conditions from the original experimental report(s). The resulting comprehensive database of 126 verified entries, ACPro, will serve as a freely accessible resource (https://www.ats.amherst.edu/protein/) for the protein folding community to enable confident testing of predictive models. In addition, we provide a streamlined submission form for researchers to add new folding kinetics results, requiring specification of all the relevant experimental information according to the standards proposed in 2005 by the protein folding consortium organized by Plaxco. As the number and diversity of proteins whose folding kinetics are studied expands, our curated database will enable efficient and confident incorporation of new experimental results into a standardized collection. This database will support a more robust symbiosis between experiment and theory, leading ultimately to more rapid and accurate insights into protein folding, stability, and dynamics.

12.
The availability of fast and robust algorithms for protein structure comparison provides an opportunity to produce a database of three-dimensional comparisons, called families of structurally similar proteins (FSSP). The database currently contains an extended structural family for each of 154 representative (below 30% sequence identity) protein chains. Each data set contains: the search structure; all its relatives with 70-30% sequence identity, aligned structurally; and all other proteins from the representative set that contain substructures significantly similar to the search structure. Very close relatives (above 70% sequence identity) rarely have significant structural differences and are excluded. The alignments of remote relatives are the result of pairwise all-against-all structural comparisons in the set of 154 representative protein chains. The comparisons were carried out with each of three novel automatic algorithms that cover different aspects of protein structure similarity. The user of the database has the choice between strict rigid-body comparisons and comparisons that take into account interdomain motion or geometrical distortions; and, between comparisons that require strictly sequential ordering of segments and comparisons, which allow altered topology of loop connections or chain reversals. The data sets report the structurally equivalent residues in the form of a multiple alignment and as a list of matching fragments to facilitate inspection by three-dimensional graphics. If substructures are ignored, the result is a database of structure alignments of full-length proteins, including those in the twilight zone of sequence similarity. (ABSTRACT TRUNCATED AT 250 WORDS)

13.
Tobi D. Proteins, 2012, 80(4): 1167-1176
A novel methodology for comparison of protein dynamics is presented. Protein dynamics is calculated using the Gaussian network model and the modes of motion are globally aligned using the dynamic programming algorithm of Needleman and Wunsch, commonly used for sequence alignment. The alignment is fast and can be used to analyze large sets of proteins. The methodology is applied to the four major classes of the SCOP database: "all alpha proteins," "all beta proteins," "alpha and beta proteins," and "alpha/beta proteins". We show that different domains may have similar global dynamics. In addition, we report that the dynamics of "all alpha proteins" domains are less specific to structural variations within a given fold or superfamily compared with the other classes. We report that domain pairs with the most similar and the least similar global dynamics tend to be of similar length. The significance of the methodology is that it suggests a new and efficient way of mapping between the global structural features of protein families/subfamilies and their encoded dynamics.
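
For readers unfamiliar with the Needleman and Wunsch algorithm mentioned in this abstract, the following is a minimal sketch of global alignment by dynamic programming. It aligns two plain strings with a simple match/mismatch/gap scheme. That scoring scheme and the function name are assumptions made for illustration; the paper aligns modes of motion and its actual scoring is not described in the abstract.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

The table has (n+1)(m+1) cells and each is filled in constant time, which is consistent with the abstract's remark that the alignment is fast enough for large sets of proteins.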

14.
It is hard to construct theories for the folding of globular proteins because they are large and complicated molecules having enormous numbers of nonnative conformations and having native states that are complicated to describe. Statistical mechanical theories of protein folding are constructed around major simplifying assumptions about the energy as a function of conformation and/or simplifications of the representation of the polypeptide chain, such as one point per residue on a cubic lattice. It is not clear how the results of these theories are affected by their various simplifications. Here we take a very different simplification approach where the chain is accurately represented and the energy of each conformation is calculated by a not unreasonable empirical function. However, the set of amino acid sequences and allowed conformations is so restricted that it becomes computationally feasible to examine them all. Hence we are able to calculate melting curves for thermal denaturation as well as the detailed kinetic pathway of refolding. Such calculations are based on a novel representation of the conformations as points in an abstract 12-dimensional Euclidean conformation space. Fast folding sequences have relatively high melting temperatures, native structures with relatively low energies, small kinetic barriers between local minima, and relatively many conformations in the global energy minimum's watershed. In contrast to other folding theories, these models show no necessary relationship between fast folding and an overall funnel shape to the energy surface, or a large energy gap between the native and the lowest nonnative structure, or the depth of the native energy minimum compared to the roughness of the energy landscape. Proteins 32:425–437, 1998. © 1998 Wiley-Liss, Inc.

15.
Nick V. Grishin. Proteins, 2015, 83(7): 1238-1251
ECOD (Evolutionary Classification Of protein Domains) is a comprehensive and up-to-date protein structure classification database. The majority of new structures released from the PDB (Protein Data Bank) each week already have close homologs in the ECOD hierarchy and thus can be reliably partitioned into domains and classified by software without manual intervention. However, those proteins that lack confidently detectable homologs require careful analysis by experts. Although many bioinformatics resources rely on expert curation to some degree, specific examples of how this curation occurs and in what cases it is necessary are not always described. Here, we illustrate the manual classification strategy in ECOD by example, focusing on two major issues in protein classification: domain partitioning and the relationship between homology and similarity scores. Most examples show recently released and manually classified PDB structures. We discuss multi-domain proteins, discordance between sequence and structural similarities, difficulties with assessing homology with scores, and integral membrane proteins homologous to soluble proteins. By timely assimilation of newly available structures into its hierarchy, ECOD strives to provide a most accurate and updated view of the protein structure world as a result of combined computational and expert-driven analysis. Proteins 2015; 83:1238–1251. © 2015 Wiley Periodicals, Inc.

16.
An elementary step in the assembly of adhesive type 1 pili of Escherichia coli is the folding of structural pilus subunits in the periplasm. The previously determined X-ray structure of the complex between the type 1 pilus adhesin FimH and the periplasmic pilus assembly chaperone FimC has shown that FimH consists of an N-terminal lectin domain and a C-terminal pilin domain, and that FimC exclusively interacts with the pilin domain. The pilin domain fold, which is common to all pilus subunits, is characterized by an incomplete beta-sheet that is completed by a donor strand from FimC in the FimC-FimH complex. This, together with unsuccessful attempts to refold isolated, urea-denatured FimH in vitro, had suggested that folding of pilin domains strictly depends on sequence information provided by FimC. We have now analyzed in detail the folding of FimH and its two isolated domains in vitro. We find that not only the lectin domain, but also the pilin domain can fold autonomously and independently of FimC. However, the thermodynamic stability of the pilin domain is very low (8-10 kJ mol⁻¹) so that a significant fraction of the domain is unfolded even in the absence of denaturant. This explains the high tendency of structural pilus subunits to aggregate non-specifically in the absence of stoichiometric amounts of FimC. Thus, pilus chaperones prevent non-specific aggregation of pilus subunits by native state stabilization after subunit folding.
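
The link between a stability of 8-10 kJ mol⁻¹ and "a significant fraction unfolded" can be made concrete with a two-state estimate. The sketch below assumes a simple two-state equilibrium between folded and unfolded forms and a temperature of 298.15 K. Neither assumption comes from the abstract, and the function name is hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def fraction_unfolded(dg_unfolding_kj_mol, temperature_k=298.15):
    """Two-state estimate of the unfolded fraction at equilibrium.

    dg_unfolding_kj_mol -- free energy of unfolding (positive = stable fold)
    temperature_k       -- assumed temperature; 298.15 K is a placeholder,
                           not a value reported in the abstract
    """
    # K_U = [unfolded] / [folded] = exp(-dG_unfold / RT)
    k_unfold = math.exp(-dg_unfolding_kj_mol / (R_KJ * temperature_k))
    return k_unfold / (1.0 + k_unfold)


for dg in (8.0, 10.0):
    print(f"dG = {dg:4.1f} kJ/mol -> fraction unfolded ~ {fraction_unfolded(dg):.3f}")
```

Under these assumptions the unfolded population comes out at roughly two to four percent, which is small in absolute terms but large enough to feed aggregation in a crowded compartment.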

17.
Understanding, and ultimately predicting, how a 1-D protein chain reaches its native 3-D fold has been one of the most challenging problems during the last few decades. Data increasingly indicate that protein folding is a hierarchical process. Hence, the question arises as to whether we can use the hierarchical concept to reduce the practically intractable computational times. For such a scheme to work, the first step is to cut the protein sequence into fragments that form local minima on the polypeptide chain. The conformations of such fragments in solution are likely to be similar to those when the fragments are embedded in the native fold, although alternate conformations may be favored during the mutual stabilization in the combinatorial assembly process. Two elements are needed for such cutting: (1) a library of (clustered) fragments derived from known protein structures and (2) an assignment algorithm that selects optimal combinations to "cover" the protein sequence. The next two steps in hierarchical folding schemes, not addressed here, are the combinatorial assembly of the fragments and finally, optimization of the obtained conformations. Here, we address the first step in a hierarchical protein-folding scheme. The input is a target protein sequence and a library of fragments created by clustering building blocks that were generated by cutting all protein structures. The output is a set of cutout fragments. We briefly outline a graph theoretic algorithm that automatically assigns building blocks to the target sequence, and we describe a sample of the results we have obtained.

18.
Temperature-jump NMR study of protein folding: Ribonuclease A at low pH   Total citations: 3, self-citations: 0, citations by others: 3
Summary: The kinetic process of folding of bovine pancreatic ribonuclease A in a ²H₂O environment at pH 1.2 was examined by a recently developed temperature-jump NMR method (Akasaka et al., (1990) Rev. Sci. Instrum. 61, 66–68). Upon temperature-jump down from 45°C to 29°C, which was attained within 6 s, the proton NMR spectral changes were followed consecutively in time intervals of seconds. There was a rapid spectral change, which was finished within the jump period, followed by a much slower process which lasted for a minute or longer. Rates of the slower process were measured at different positions of the polypeptide chain as intensity changes of individual His and Tyr proton signals of the folded conformer and as intensity changes of aliphatic and His protons of the unfolded conformer. Most of these rates coincided with each other within experimental error with an average value of 2.8×10⁻² s⁻¹. The result gave clear experimental evidence that the slow folding of RNase A at low pH is a cooperative process involving most regions of the molecule, not only thermodynamically, but kinetically as well.
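
To relate the reported average rate to the statement that the slow phase "lasted for a minute or longer", the rate constant can be converted to a time constant and half-life. This is a minimal sketch that assumes the slow phase follows single-exponential (first-order) kinetics, which the abstract does not state explicitly. The helper names are hypothetical.

```python
import math


def time_constant(k):
    """Time constant tau = 1/k for a first-order process (k in s^-1)."""
    return 1.0 / k


def half_life(k):
    """Half-life t_1/2 = ln(2)/k for a first-order process."""
    return math.log(2.0) / k


def fraction_remaining(k, t):
    """Fraction of the starting population not yet converted at time t."""
    return math.exp(-k * t)


k_slow = 2.8e-2  # s^-1, average rate of the slow phase from the abstract
print(f"tau       ~ {time_constant(k_slow):.1f} s")
print(f"half-life ~ {half_life(k_slow):.1f} s")
print(f"remaining at 60 s ~ {fraction_remaining(k_slow, 60.0):.2f}")
```

With a time constant of about 36 s, a noticeable fraction of the signal change is still outstanding after one minute, in line with the description in the abstract.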
