Similar articles
1.
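Analysis of the dipeptide composition of 3,216 thermophilic and 4,007 mesophilic proteins showed that thermophilic proteins contain more of the dipeptides EE, EK, KE, VE, EI, KI, EV, KK, VK and IE, and fewer of the dipeptides AA, LL, LA, AL, QA, QL, AQ, LT, TL and EQ. On this basis, a statistical method for discriminating thermophilic from mesophilic proteins was developed. Applied to two groups totalling 853 protein sequences, the method achieved average accuracies of 89.0% and 89.6%, respectively. The effect of certain specific dipeptides on discrimination performance is also discussed.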
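The dipeptide composition that this abstract relies on can be sketched as follows. This is a minimal illustration, not the authors' statistical method: the function names and the simple "enriched minus depleted" score are assumptions introduced here, and only the two dipeptide lists come from the abstract.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Dipeptide lists taken from the abstract above.
ENRICHED = ("EE", "EK", "KE", "VE", "EI", "KI", "EV", "KK", "VK", "IE")
DEPLETED = ("AA", "LL", "LA", "AL", "QA", "QL", "AQ", "LT", "TL", "EQ")


def dipeptide_composition(sequence):
    """Return the relative frequency of each of the 400 dipeptides."""
    seq = sequence.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Keep only pairs built from the 20 standard residues.
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    total = len(valid)
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


def dipeptide_bias(sequence):
    """Hypothetical score: enriched minus depleted dipeptide frequency."""
    comp = dipeptide_composition(sequence)
    return sum(comp[d] for d in ENRICHED) - sum(comp[d] for d in DEPLETED)
```

A positive `dipeptide_bias` only indicates that the listed thermophile-associated dipeptides outweigh the mesophile-associated ones in a sequence; the decision rule and thresholds used in the article are not given in the abstract.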

2.
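A decision-tree ensemble classifier based on boosting was used for pattern recognition of thermophilic and mesophilic proteins. Performance was assessed with three methods: a self-consistency check, cross-validation, and an independent test set. LogitBoost, a newer member of the boosting family, performed best, with accuracies of 100%, 88.4% and 89.5%, respectively, outperforming a neural network. The influence of protein size on discrimination was also examined. The results indicate that combining boosting effectively with other single classifiers is likely to improve researchers' ability to recognize relevant properties of biomolecules.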

3.
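Comparison of the amino acid composition of proteins from thermophilic and mesophilic microorganisms   Total citations: 11 (self-citations: 0, citations by others: 11)
The thermophilic character of thermophilic microorganisms is closely tied to the high thermal stability of their proteins. To explore the mechanism of this thermostability, 110 pairs of homologous protein sequences were collected, each pair consisting of one protein from a thermophile and one from a mesophile. The two groups were compared in the content of each amino acid, hydrophobic amino acid composition, hydrophobicity index, and charged amino acid composition. The contents of many amino acids differed slightly but statistically significantly between the groups, and thermophilic proteins had higher average hydrophobicity and a higher proportion of charged amino acids than mesophilic proteins. Analysis of the "aliphatic index" of the two groups showed that the higher aliphatic index of thermophilic proteins is due to their higher leucine content and is unrelated to the other amino acids that contribute to the index; the significance of this index is therefore considered questionable. Comparing the amino acid compositions of large numbers of homologous thermophilic and mesophilic proteins can reveal some general rules of protein thermostability.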
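The leucine argument above is easier to follow with the index written out. The sketch below assumes that the "aliphatic index" in this abstract is Ikai's aliphatic index, AI = X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percentage of each residue; the abstract itself does not state the formula, and the function names are hypothetical.

```python
# Weights of Ikai's aliphatic index (assumed to be the index meant above).
WEIGHTS = {"A": 1.0, "V": 2.9, "I": 3.9, "L": 3.9}


def aliphatic_contributions(sequence):
    """Return the contribution of Ala, Val, Ile and Leu to the index."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Mole percent of each residue, scaled by its weight.
    return {
        res: weight * 100.0 * seq.count(res) / len(seq)
        for res, weight in WEIGHTS.items()
    }


def aliphatic_index(sequence):
    """Sum of the four weighted contributions."""
    return sum(aliphatic_contributions(sequence).values())
```

Comparing `aliphatic_contributions` for a thermophilic protein and its mesophilic homologue shows which residue drives a difference in the total, which is the kind of decomposition the abstract describes for leucine.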

4.
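Promoter prediction is an important step in studying the regulation of gene transcription, but existing algorithms have low prediction accuracy. Based on an in-depth analysis of the biological features of promoters, a support vector machine-based algorithm for predicting Bacillus subtilis promoters is proposed. Nine typical features are selected from the compositional, signal, and structural features of promoter sequences as the basis for prediction; for signal features, the distribution of spacer distances is considered in addition to the consensus sequences of conserved motifs. A feature description model first computes the score of each feature in promoter and non-promoter sequences. The scores are combined into a 9-dimensional feature vector, and a support vector machine is then trained on the set of feature vectors and used for classification. Jackknife tests on a real data set confirmed the effectiveness of the algorithm. For σ promoters, the average accuracy reached 90.7%; for promoters of several other σ factors, the average accuracy also exceeded 80%. The algorithm is widely applicable and readily extensible: new features can easily be incorporated to further improve recognition performance.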

5.
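Through counting, isolation, and screening, the distribution and resource status of thermophilic bacteria and thermophilic-protease-producing bacteria in ambient-temperature environments were studied. The results show that such environments contain a certain number of both. Compared with water bodies, soil is relatively rich in thermophilic bacteria, and fertile cultivated soil contains more thermophilic-protease-producing bacteria than poor soil. In aquatic environments, whether lake water, river water, or wastewater under treatment, a certain proportion of thermophilic bacteria and thermophilic-protease-producing bacteria was present at ambient temperature. In the aeration stage of brewery wastewater treatment, thermophilic-protease-producing bacteria made up a large share of the thermophilic bacteria, reaching 45%. One thermophilic strain screened in this study showed high thermophilic protease activity, reaching 642 U/ml at pH 7.6 and 68 ℃. This study provides an important scientific basis for developing thermophilic-protease-producing bacterial resources and for their application in industry and environmental remediation.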

6.
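It has been reported that extracting features from the amino acid composition distribution can effectively improve classification accuracy. This study uses that method to extract features and applies a new ensemble classifier, random forest, to classify thermophilic and psychrophilic proteins from their primary structure. Evaluation by 10-fold cross-validation and an independent test set showed that accuracy was best when the number of segments was 1, at 92.9% and 90.2%, respectively. This suggests that feature extraction based on the amino acid composition distribution does not effectively improve accuracy for this algorithm, contrary to the earlier report, although the same extraction method does moderately improve accuracy with an SVM. After six new variables were introduced, accuracy rose to 93.2% and 92.2%, and the areas under the ROC curve were 0.9771 and 0.9696, respectively, better than other ensemble classifiers.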
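One plausible reading of "amino acid composition distribution" with a configurable number of segments is sketched below. The abstract does not define the feature extraction precisely, so the equal-length splitting rule and the function names are assumptions; with `n_segments=1` the result is the ordinary 20-dimensional amino acid composition.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def split_segments(sequence, n_segments):
    """Split a sequence into n_segments consecutive, near-equal parts."""
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    length = len(sequence)
    # Integer boundaries keep every residue in exactly one segment.
    bounds = [i * length // n_segments for i in range(n_segments + 1)]
    return [sequence[bounds[i]:bounds[i + 1]] for i in range(n_segments)]


def segmented_composition(sequence, n_segments=1):
    """Concatenate the 20-residue composition of each segment."""
    features = []
    for segment in split_segments(sequence.upper(), n_segments):
        size = len(segment)
        features.extend(
            segment.count(aa) / size if size else 0.0 for aa in AMINO_ACIDS
        )
    return features
```

The feature vector has 20 × `n_segments` entries, so increasing the number of segments adds positional information at the cost of more, and noisier, features.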

8.
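Predicting the subcellular localization of apoptosis proteins using grouped weight encoding   Total citations: 2 (self-citations: 1, citations by others: 1)
Starting from the physicochemical properties of amino acids and borrowing the ideas of "coarse-graining" and "grouping" from physics, a new feature extraction method for protein sequences, grouped weight encoding, is proposed. With the component-coupled algorithm as the classifier, the subcellular localization of apoptosis proteins was studied from the primary sequence. On the data set used by Zhou and Doctor, the overall prediction accuracies in the re-substitution and jackknife tests were 98.0% and 85.7%, respectively, 7.2% and 13.2% higher than those obtained with amino acid composition and the component-coupled algorithm. On the data set used by Chen Yingli (陈颖丽) and Li Qianzhong (李前忠), the overall accuracies in the re-substitution and jackknife tests were 94.0% and 80.1%, respectively, 5.9% and 2.0% higher than those obtained with dipeptide composition and the increment of diversity algorithm. On our own newly compiled data set, the overall accuracies in the re-substitution and jackknife tests were 97.33% and 75.11%, respectively. The results show that grouped weight encoding of protein sequences is an effective feature extraction method for studying the localization of apoptosis proteins.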

9.
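TTE0732, encoded by the tte0732 (Galu) gene of Thermoanaerobacter tengcongensis, is a temperature-dependent protein. To study its role in thermal adaptation, the tte0732 gene was cloned from T. tengcongensis by PCR, the prokaryotic expression vector pET-28a::tte0732 was constructed, and TTE0732 was expressed in Escherichia coli BL21. RNA expression of tte0732 at 50, 60, 75 and 80 ℃ was analysed by qRT-PCR, and bioinformatics software was used to analyse the basic physicochemical properties of the amino acids encoded by Galu in thermophilic and mesophilic bacteria. The vector pET-28a::tte0732 was successfully constructed and highly expressed in E. coli BL21; TTE0732 has a molecular mass of 35 kDa and exists mainly in soluble form. qRT-PCR showed that tte0732 mRNA was highly expressed at 75 and 80 ℃. Bioinformatic analysis showed that the complete ORF of tte0732 is 909 bp long and encodes 302 amino acids, with higher Ile (I) and Leu (L) content than in mesophilic bacteria. The encoded protein is acidic and hydrophilic, with an isoelectric point of 5.22 and 18 potential phosphorylation sites, and has no transmembrane structures, signal peptide, or glycosylation sites. Its predicted secondary structure consists mainly of α-helix, random coil, and β-sheet. TTE0732 of T. tengcongensis is a hydrophilic protein that can be highly expressed in a prokaryotic system, and these results provide a reference for studying the thermostability mechanism of thermophilic proteins.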

10.
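Thermophilic proteins remain stable and active at high temperatures, making them ideal models for studying protein thermostability; a method for identifying protein thermostability would be of great help to protein engineering and protein design. Amino acid composition and physicochemical properties have long been considered to be related to protein thermostability. This study screened a reliable data set comprising 915 thermophilic and 793 non-thermophilic proteins. Thermophilic proteins were characterized using amino acid physicochemical properties and amino acid composition, with dipeptide composition integrated into nine groups of amino acid physicochemical properties to formulate the protein sequences. Five-fold cross-validation with a support vector machine showed that, at gap = 0, 290 features gave the highest accuracy, 92.74%. The prediction model is therefore expected to be an effective tool for analysing protein thermostability.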
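The `gap` parameter mentioned above is commonly understood as the number of residues skipped between the two members of a pair, so that gap = 0 gives ordinary adjacent dipeptides. The following sketch assumes that interpretation; the function name is hypothetical, and the nine physicochemical groups and the selection of 290 features are not reproduced because the abstract does not specify them.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def gapped_dipeptide_composition(sequence, gap=0):
    """Frequency of residue pairs separated by `gap` intervening residues."""
    if gap < 0:
        raise ValueError("gap must be non-negative")
    seq = sequence.upper()
    step = gap + 1  # distance between the two residues of a pair
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    for i in range(len(seq) - step):
        pair = seq[i] + seq[i + step]
        if pair in counts:  # skip pairs with non-standard residues
            counts[pair] += 1
            total += 1
    return {pair: (n / total if total else 0.0) for pair, n in counts.items()}
```

For the sequence `"AKAK"`, gap = 0 counts the pairs AK, KA, AK, while gap = 1 counts AA and KK, which shows how the parameter changes what "dipeptide" means.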

11.
Gromiha MM, Suresh MX. Proteins, 2008, 70(4): 1274-1279
Discriminating thermophilic proteins from their mesophilic counterparts is a challenging task, and it would help in designing stable proteins. In this work, we have systematically analyzed the amino acid compositions of 3075 mesophilic and 1609 thermophilic proteins belonging to 9 and 15 families, respectively. We found that the charged residues Lys, Arg, and Glu, as well as the hydrophobic residues Val and Ile, occur more often in thermophiles than in mesophiles. Further, we have analyzed the performance of different methods, based on Bayes rules, logistic functions, neural networks, support vector machines, decision trees and so forth, for discriminating mesophilic and thermophilic proteins. We found that most of the machine learning techniques discriminate these classes of proteins with similar accuracy. The neural network-based method could discriminate thermophiles from mesophiles with a five-fold cross-validation accuracy of 89% in a dataset of 4684 proteins. Moreover, this method was tested with 325 mesophilic proteins from Xylella fastidiosa and 382 thermophilic proteins from Aquifex aeolicus, and it discriminated them with an accuracy of 91%. These accuracy levels are better than those of other methods in the literature, and we suggest that this method could be effectively used to discriminate mesophilic and thermophilic proteins.

12.
The stability of thermophilic proteins has been viewed from different perspectives, and there is as yet no unified principle to understand this stability. It would be valuable to reveal the most important interactions for designing thermostable proteins for such applications as industrial protein engineering. In this work, we have systematically analyzed the importance of various interactions by computing different parameters such as surrounding hydrophobicity, inter-residue interactions, ion pairs and hydrogen bonds. The importance of each interaction has been determined by its predicted relative contribution in thermophiles versus the same contribution in mesophilic homologues, based on a dataset of 373 protein families. We predict that hydrophobic environment is the major factor for the stability of thermophilic proteins and found that 80% of thermophilic proteins analyzed showed higher hydrophobicity than their mesophilic counterparts. Ion pairs, hydrogen bonds, and interaction energy are also important and favored in 68%, 50%, and 62% of thermophilic proteins, respectively. Interestingly, thermophilic proteins with decreased hydrophobic environments display a greater number of hydrogen bonds and/or ion pairs. The systematic elimination of mesophilic proteins based on surrounding hydrophobicity, interaction energy, and ion pairs/hydrogen bonds led to correctly identifying 95% of the thermophilic proteins in our analyses. Our analysis was also applied to another, more refined set of 102 thermophilic–mesophilic pairs, which again identified hydrophobicity as a dominant property in 71% of the thermophilic proteins. Further, the notion of surrounding hydrophobicity, which characterizes the hydrophobic behavior of residues in a protein environment, has been applied to the three-dimensional structures of elongation factor-Tu proteins, and we found that the thermophilic proteins are enriched with a hydrophobic environment. The results obtained in this work highlight the importance of hydrophobicity as the dominating characteristic in the stability of thermophilic proteins, and we anticipate this will be useful in our attempts to engineer thermostable proteins. © Proteins 2013. © 2012 Wiley Periodicals, Inc.

13.
Identifying thermostability from amino acid sequence information would be helpful in computational screening for thermostable proteins. We have developed a method to discriminate thermophilic and mesophilic proteins based on support vector machines. Using self-consistency validation, 5-fold cross-validation and an independent testing procedure with other datasets, this module achieved overall accuracies of 94.2%, 90.5% and 92.4%, respectively. The performance of this SVM-based module was better than that of classifiers built using alternative machine learning and statistical algorithms, including artificial neural networks, Bayesian statistics, and decision trees, when evaluated using these three validation methods. The influence of protein size on prediction accuracy was also addressed.

14.
Nakariyakul S, Liu ZP, Chen L. Amino Acids, 2012, 42(5): 1947-1953
Detecting thermophilic proteins is an important task in engineering proteins that are stable at temperatures of interest. In this work, we develop a simple but efficient method to classify thermophilic proteins from mesophilic ones using the amino acid and dipeptide compositions. Since most of the amino acid and dipeptide compositions are redundant, we propose a new forward floating selection technique to select only a useful subset of these compositions as features for support vector machine-based classification. We test the proposed method on a benchmark data set of 915 thermophilic and 793 mesophilic proteins. The results show that our method, using 28 amino acid and dipeptide compositions, achieves an accuracy rate of 93.3% evaluated by the jackknife cross-validation test, which is higher not only than the existing methods but also than using all amino acid and dipeptide compositions.
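To make the idea of forward floating selection concrete, here is a minimal sketch of the textbook sequential forward floating selection (SFFS) procedure. It is not the new variant the authors propose, whose details are not given in the abstract; the function name and the caller-supplied `score` function (for example, a cross-validated accuracy) are assumptions.

```python
def forward_floating_selection(features, score, k):
    """Textbook SFFS: add the best feature, then drop features while that helps.

    `score` maps a list of features to a number (higher is better).
    """
    if not 1 <= k <= len(features):
        raise ValueError("k must be between 1 and the number of features")
    selected = []
    best = {}  # subset size -> (best score seen, subset)
    while len(selected) < k:
        # Forward step: add the single feature that maximises the score.
        remaining = [f for f in features if f not in selected]
        added = max(remaining, key=lambda f: score(selected + [f]))
        selected = selected + [added]
        current = score(selected)
        if len(selected) not in best or current > best[len(selected)][0]:
            best[len(selected)] = (current, list(selected))
        # Floating step: remove features while a smaller subset beats
        # the best subset of that size found so far.
        while len(selected) > 2:
            dropped = max(
                selected,
                key=lambda f: score([x for x in selected if x != f]),
            )
            reduced = [x for x in selected if x != dropped]
            reduced_score = score(reduced)
            if reduced_score > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (reduced_score, list(reduced))
            else:
                break
    return best[k][1]
```

The backward step is what distinguishes floating selection from plain forward selection: a feature chosen early can be discarded later if it becomes redundant once other features are present.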

15.
A database was designed to include 392 pairs of homologous proteins from thermophilic and mesophilic organisms. Proteins from thermophilic organisms proved to contain more atom-atom contacts per residue than their mesophilic homologs. Solvent-accessible exterior amino acid residues contribute to the increase in the number of contacts. The amino acid composition was analyzed for interior (solvent-inaccessible) and exterior amino acid residues of thermophilic and mesophilic proteins. The exterior residues of thermophilic proteins have higher contents of Lys, Arg, and Glu and lower contents of Ala, Asp, Asn, Gln, Ser, and Thr than those of mesophilic proteins. Interior protein regions did not differ in amino acid composition.

16.
A novel classifier, the LogitBoost classifier, was introduced to discriminate thermophilic and mesophilic proteins according to their primary structures. When the 20-amino acid composition was chosen as the feature vector, the overall accuracies of the self-consistency check and a five-fold cross-validation procedure were 97.0% and 86.6%, respectively. To test whether the method was also applicable to a wide range of biological targets, an independent testing dataset was also used, on which the LogitBoost-based method achieved an overall classification accuracy of 88.9%. Across the three validation approaches, LogitBoost outperformed AdaBoost and performed comparably with an RBF neural network and a support vector machine. The influence of protein size on discrimination was also addressed.
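The five-fold cross-validation used in this and several other abstracts can be sketched as below. This is a generic illustration under stated assumptions: the function names are hypothetical, the classifier is passed in as caller-supplied `fit` and `predict` functions, and no claim is made about how the authors partitioned their data.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train_indices, test_indices) pairs covering all samples."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most 1
    splits = []
    for i in range(k):
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        splits.append((train, sorted(folds[i])))
    return splits


def cross_validated_accuracy(features, labels, fit, predict, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, and pool the results."""
    correct = 0
    for train, test in k_fold_indices(len(labels), k, seed):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct += sum(predict(model, features[i]) == labels[i] for i in test)
    return correct / len(labels)
```

A self-consistency check, by contrast, trains and tests on the same samples, which is why its reported accuracy is usually the highest of the three validation schemes.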

17.
Identification of the characteristic structural patterns responsible for protein thermostability is theoretically important and practically useful but largely remains an open problem. These patterns may be revealed through comparative study of thermophilic and mesophilic proteins, which differ in thermostability. In this study, we constructed several distance-dependent potentials from thermophilic and mesophilic proteins. These potentials were then used to evaluate the structural difference between thermophilic and mesophilic proteins. We found that using the subtraction or division of the potentials derived from thermophilic and mesophilic proteins can dramatically increase the discriminatory ability. This approach revealed that the ability to distinguish the subtle structural features responsible for protein thermostability may be effectively enhanced through rationally designed comparative study.

18.
One of the well-known observations about proteins from thermophilic bacteria is a bias in amino acid composition, in which charged residues are present in large numbers and polar residues are scarce. On the other hand, it has been reported that the molecular surfaces of proteins are adapted to their subcellular locations in terms of amino acid composition. Thus, it would be reasonable to expect that the differences in amino acid composition between proteins of thermophilic and mesophilic bacteria would be much greater on the protein surface than in the interior. We performed systematic comparisons between proteins from thermophilic bacteria and mesophilic bacteria, in terms of the amino acid composition of the protein surface and the interior, as well as the entire amino acid chains, by using sequence information from the genome projects. The biased amino acid composition of thermophilic proteins was confirmed, and the differences from mesophilic proteins were most obvious in the composition of the protein surface. In contrast to the surface composition, the interior composition did not differ distinctly between thermophilic and mesophilic proteins. The frequency of amino acid pairs located close together in space was also analyzed and showed the same trend as the single amino acid compositions. Interestingly, extracellular proteins from mesophilic bacteria showed the inverse trend to thermophilic proteins (i.e., fewer charged residues and more polar residues). Nuclear proteins from eukaryotes, which are known to be abundant in positive charges, showed compositions that differed as a whole from those of the thermophiles. These results suggest that the bias in the amino acid composition of thermophilic proteins is due to the residues on the protein surfaces, which may be constrained by the extreme environment.

19.
This study compares the performance of anaerobic digestion of fruit and vegetable waste (FVW) in the thermophilic (55 °C) process with that under psychrophilic (20 °C) and mesophilic (35 °C) conditions in tubular anaerobic digesters on a laboratory scale. The hydraulic retention time (HRT) ranged from 10 to 20 days, and raw fruit and vegetable waste was supplied in a semi-continuous mode at various concentrations of total solids (TS) (4, 6, 8 and 10% on dry weight). Biogas production from the experimental thermophilic digester was higher on average than from the psychrophilic and mesophilic digesters by 144 and 41%, respectively. The net energy production in the thermophilic digester was 195.7 and 49.07 kJ per day higher than that for the psychrophilic and mesophilic digesters, respectively. The relation between the daily production of biogas and the temperature indicates that, for the same quantity of biogas produced, the size of the thermophilic digester can be reduced relative to that of the psychrophilic and mesophilic digesters.
