Found 19 similar articles (search time: 78 ms)
1.
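Advances in protein secondary structure prediction  Total citations: 1, self-citations: 0, citations by others: 1
Knowledge of protein secondary structure is the basis for understanding protein folding patterns and tertiary structure. It provides a structural foundation for studying protein function and the modes of interaction between proteins, and it can also assist new drug development, so studying protein secondary structure is of great significance. With the arrival of the post-genomic era, more and more protein sequences are being discovered, which brings both great challenges and research opportunities for the study of protein secondary structure. Traditional experimental methods can hardly obtain secondary structure information for proteins on a large scale, so bioinformatics approaches remain the way to obtain the secondary structure of most proteins. In recent years the field has developed rapidly, with many researchers constructing protein datasets for secondary structure prediction, computing and extracting various protein features, and applying different prediction algorithms. This paper reviews progress in protein secondary structure prediction from the perspectives of feature extraction and selection, prediction algorithms, and methods for evaluating prediction performance. With the continued development of genomics, proteomics, and bioinformatics, protein secondary structure prediction is expected to achieve new breakthroughs.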
2.
3.
4.
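Protein secondary structure prediction is a crucial step toward understanding the spatial structure of proteins. This article proposes a simple secondary structure prediction method that uses majority voting to pool the results of three existing, well-performing secondary structure prediction methods into a consensus prediction. The method was tested on 57 proteins with less than 30% similarity, randomly selected from structures newly determined in the PDB database over the preceding two years. Its Q3 accuracy was 1.12 to 2.29% higher than that of the three individual methods, and the correlation coefficients and SOV accuracy improved correspondingly. All accuracy measures were also higher than those of Jpred, a secondary structure prediction program that likewise uses a consensus approach. Although the principle is simple, the method needs no additional parameters, is computationally light, and is easy to implement; the most important prerequisite is that the underlying prediction methods be among the most accurate currently available.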
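The consensus idea in this abstract (a per-residue majority vote over three predictors, scored by Q3) can be sketched as follows. This is a minimal illustration, not the authors' program: the three-state alphabet H/E/C, the function names `consensus` and `q3`, and the fallback to coil when all three predictors disagree are assumptions, since the abstract does not say how ties are resolved.

```python
from collections import Counter


def consensus(predictions, tie_break="C"):
    """Per-residue majority vote over equal-length secondary structure strings.

    `tie_break` (hypothetical choice: coil) is used when no state has a strict
    majority, e.g. when three predictors give three different states.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    result = []
    for column in zip(*predictions):
        state, count = Counter(column).most_common(1)[0]
        # Accept the most common state only if more than half the methods agree.
        result.append(state if count * 2 > len(column) else tie_break)
    return "".join(result)


def q3(predicted, observed):
    """Q3 accuracy: fraction of residues whose three-state label is correct."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Toy input: three made-up predictions for a five-residue peptide.
merged = consensus(["HHHEC", "HHCEC", "HECEE"])
```

On the toy input, `merged` is `"HHCEC"`; comparing it with a made-up observed string `"HHHEC"` via `q3` gives 0.8.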
5.
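Secondary structure prediction and three-dimensional conformation of the N-terminus of the P53 protein  Total citations: 2, self-citations: 0, citations by others: 2
Based on the secondary structure of the mRNA encoding the 120 N-terminal residues of P53, combined with the Chou-Fasman rules for protein secondary structure prediction, the 93 N-terminal residues of the P53 protein were predicted to contain four α-helical segments (14-26; 38-46; 51-56; 68-70), and no β-sheet was found. The result closely matches those of four protein secondary structure prediction methods based on multiple sequence alignment (each with an accuracy of about 73.20%). A three-dimensional conformation built from this initial structure on an SGI workstation suggests that the first 80 amino acids of the P53 N-terminus form an arc-shaped plate structure, whose transcriptional activation region consists of two main helices in an upper and lower arrangement, occupying the outer edges of the top and bottom of the arc-shaped plate. The 13-residue proline-rich segment at the C-terminal end is bent and loosely structured. These conformations are consistent with the biological function of the P53 N-terminus.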
6.
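Prediction of secondary structure at the 5′ end of mRNA  Total citations: 2, self-citations: 0, citations by others: 2
Why is it important to study the secondary structure of the 5′ end of mRNA? The following examples illustrate the point: secondary structure within the coding sequence near a splice site affects gene transcription; secondary structure in a primer (such as a hairpin) inhibits the PCR reaction; and secondary structure at the 5′ end of a synthesized gene sequence interferes with its translation in the expression system and lowers the yield of the gene expression product, which not only raises production costs but also greatly complicates the production process. In addition, the energy required to form secondary structure at the 5′ end of the RNA affects the yield of the gene product. To obtain large amounts of gene product in a gene expression system, these issues must therefore be considered, along with the structure of the 5′ end of the cloned gene and what it requires to form secondary structure…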
7.
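Relationship between predicted secondary structure and function of complement proteins  Total citations: 1, self-citations: 0, citations by others: 1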
8.
9.
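Secondary structure formation: the framework model of the initiation of protein folding  Total citations: 8, self-citations: 1, citations by others: 7
The framework model holds that secondary structure formation is the structural basis of the initiation of protein folding. This article introduces the solution conformations of protein homologous fragments and the methods used to study them, as well as the de novo design of polypeptide secondary structure, and reviews how these results are applied to theoretical models of folding initiation, together with the latest research progress on the initiation of protein folding.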
10.
11.
12.
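The generalized hidden Markov model (GHMM) is an important model for gene identification, but its computational cost is much higher than that of the conventional hidden Markov model, so it cannot be used directly for gene identification. Based on the structural characteristics of prokaryotic genes, an efficient simplified algorithm is proposed whose computational cost is a linear function of sequence length. On this basis, GeneMiner, a gene identification program for prokaryotic genes, was built; tests on real data show that the algorithm is effective.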
13.
This paper examines recent developments and applications of Hidden Markov Models (HMMs) to various problems in computational biology, including multiple sequence alignment, homology detection, protein sequence classification, and genomic annotation.
14.
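Detection of deletions and insertions in coding sequences based on a hidden Markov model (in English)  Total citations: 2, self-citations: 0, citations by others: 2
Since the completion of genome sequencing, the use of computational tools for gene identification and gene structure prediction has attracted growing attention. A large number of applications have been developed, such as GenScan, Genemark, and GRAIL, and these programs provide important clues in the search for new genes. The gene identification and prediction problem is still not fully solved, however: when the coding sequence of the target gene contains deletions and insertions, the predicted structure differs greatly from the actual gene structure. To eliminate the effect of sequencing errors on prediction, one would like to locate sequencing errors in coding regions. With this in mind, the authors use statistical properties of DNA sequences and a hidden Markov model (HMM) with added deletion and insertion states, and then apply the Viterbi algorithm to find exon sequence fragments containing deletions and insertions. Tested on the widely used Burset/Guigo test set, the method achieves Sn (sensitivity) and Sp (specificity) both above 84% at the exon level.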
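The Viterbi step named in this abstract finds the single most probable state path through a hidden Markov model. The sketch below is a generic log-space Viterbi decoder under stated assumptions: the two-state model, its state names `A` and `B`, the symbols `x` and `y`, and all probabilities are made up for illustration and do not reproduce the paper's model with deletion and insertion states.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path for a sequence of observations."""
    if not observations:
        return []

    def log(x):
        # Zero probabilities map to -inf so they can never win the max.
        return math.log(x) if x > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at t.
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical two-state toy model (not taken from the paper).
states = ("A", "B")
start_p = {"A": 0.5, "B": 0.5}
trans_p = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.1, "B": 0.9}}
emit_p = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.1, "y": 0.9}}
path = viterbi("xxxyyy", states, start_p, trans_p, emit_p)
```

For the toy input, `path` is `["A", "A", "A", "B", "B", "B"]`: a single state switch costs less than emitting three unlikely symbols. Working in log space avoids numerical underflow on long sequences.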
15.
Song YS 《Bulletin of mathematical biology》2006,68(2):361-384
In hidden Markov models, the probability of observing a set of strings can be computed using recursion relations. We construct a sufficient condition for simplifying the recursion relations for a certain class of hidden Markov models. If the condition is satisfied, then one can construct a reduced recursion where the dependence on Markov states completely disappears. We discuss a specific example, namely statistical multiple alignment based on the TKF-model, in which the sufficient condition is satisfied.
16.
MicroRNAs are a class of small single-stranded RNAs of about 22 nt that serve as important negative gene regulators. In animals, miRNAs mainly repress protein translation by binding to the 3′ UTR regions of mRNAs with imperfect complementary pairing. Although bioinformatics investigations have resulted in a number of target prediction tools, all of these share a common shortcoming: a high false positive rate. Therefore, it is important to further filter the predicted targets. In this paper, based on the miRNA:target duplex, we construct a second-order Hidden Markov Model, implement the Baum-Welch training algorithm and apply this model to further process predicted targets. The classifier is trained on 244 positive and 49 negative miRNA:target interaction pairs and achieves a sensitivity of 72.54%, specificity of 55.10% and accuracy of 69.62% in 10-fold cross-validation experiments. To further verify the applicability of the algorithm, previously collected datasets, including 195 positive and 38 negative pairs, are chosen to test it, with consistent results. We believe that our method will provide some guidance for experimental biologists, especially in choosing miRNA targets for validation.
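The three percentages reported in this abstract can be checked against the stated class sizes. The sketch below assumes the figures are pooled over all cross-validation folds; the counts of 177 true positives and 27 true negatives are inferred here from the reported percentages and the 244/49 split, and are not stated in the abstract. The function name `confusion_metrics` is likewise only illustrative.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of positives recovered
    specificity = tn / (tn + fp)  # fraction of negatives rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Inferred counts (assumption): 177 of 244 positives and 27 of 49 negatives
# classified correctly.
sens, spec, acc = confusion_metrics(tp=177, fn=244 - 177, tn=27, fp=49 - 27)
report = f"{sens:.2%} {spec:.2%} {acc:.2%}"
```

With these counts `report` is `"72.54% 55.10% 69.62%"`, matching the reported values. Because positives outnumber negatives about five to one, accuracy sits close to sensitivity even though specificity is only slightly above one half.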
17.
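Application of the hidden semi-Markov model to 3′ splice site recognition (in English)  Total citations: 1, self-citations: 0, citations by others: 1
Recent gene identification programs are markedly better than earlier ones, but sensitivity and specificity at the exon level are still not fully satisfactory, because existing programs are not yet effective enough at recognizing biological signal sites such as splice sites and translation start sites. If recognition of each of these signal sites could be improved, overall gene identification performance would improve as well. The hidden semi-Markov model describes the structure of the 3′ splice site (acceptor) well. An acceptor recognition algorithm developed on this basis was tested on the Burset/Guigo dataset and achieved a better recognition rate than existing algorithms. The model's success also gives some insight into the general features of the branch site and the pyrimidine-rich region upstream of the splice site, deepening understanding of acceptor structure and the splicing process.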
18.
Three different strategies to tackle mispredictions from incorrect secondary structure prediction are analysed using 21 small proteins (22-121 amino acids; 1-6 secondary structure elements) with known three dimensional structures: (1) Testing accuracy of different secondary structure predictions and improving them by combinations, (2) correcting mispredictions exploiting protein folding simulations with a genetic algorithm and (3) applying and combining experimental data to refine predictions both for secondary structure and tertiary fold. We demonstrate that predictions from secondary structure prediction programs can be efficiently combined to reduce prediction errors from missed secondary structure elements. Further, up to two secondary structure elements (helices, strands) missed by secondary structure prediction were corrected by the genetic algorithm simulation. Finally, we show how input from experimental data is exploited to refine the predictions obtained. Electronic Supplementary Material available.
19.
A homology identification method that combines protein sequence and structure information.  Total citations: 3, self-citations: 2, citations by others: 1
L. Yu J. V. White T. F. Smith 《Protein science : a publication of the Protein Society》1998,7(12):2499-2510
A new method is presented for identifying distantly related homologous proteins that are unrecognizable by conventional sequence comparison methods. The method combines information about functionally conserved sequence patterns with information about structure context. This information is encoded in stochastic discrete state-space models (DSMs) that comprise a new family of hidden Markov models. The new models are called sequence-pattern-embedded DSMs (pDSMs). This method can identify distantly related protein family members with a high sensitivity and specificity. The method is illustrated with trypsin-like serine proteases and globins. The strategy for building pDSMs is presented. The method has been validated using carefully constructed positive and negative control sets. In addition to the ability to recognize remote homologs, pDSM sequence analysis predicts secondary structures with higher sensitivity, specificity, and Q3 accuracy than DSM analysis, which omits information about conserved sequence patterns. The identification of trypsin-like serine proteases in new genomes is discussed.