Similar literature
20 similar articles found (search time: 15 ms)
1.
2.
HIV-1 protease has a broad and complex substrate specificity. The discovery of an accurate, robust, and rapid method for predicting the cleavage sites in proteins by HIV protease would greatly expedite the search for inhibitors of HIV protease. During the last two decades, various methods have been developed to explore the specificity of HIV protease cleavage activity. However, because little advancement has been made in the understanding of HIV-1 protease cleavage site specificity, not much progress has been reported in either extracting effective methods or maintaining high prediction accuracy. In this article, a theoretical framework is developed, based on the kernel method for dimensionality reduction and prediction for HIV-1 protease cleavage site specificity. A nonlinear dimensionality reduction kernel method, based on manifold learning, is proposed to reduce the high dimensions of protease specificity. A support vector machine is applied to predict the protease cleavage. By combining the nonlinear dimensionality reduction algorithm with a support vector machine classifier, performance superior to that previously published in the literature is obtained, and numerical simulations show that the basic specificities of the HIV-1 protease are maintained in the reduced feature space.

3.
This paper presents a new neural learning algorithm for protease cleavage site prediction. The basic idea is to replace the radial basis function used in radial basis function neural networks by a so-called bio-basis function using amino acid similarity matrices. Mutual information is used to select bio-bases and a corresponding selection algorithm is developed. The algorithm has been applied to the prediction of HIV and Hepatitis C virus protease cleavage sites in proteins with success.
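A minimal sketch of the bio-basis idea described above: the distance term of a radial basis function is replaced by a similarity score between an input peptide and a basis peptide, summed position by position from an amino acid similarity matrix. The abstract does not give the formula, so the functional form used here (exponential of the normalised similarity difference), the `gamma` parameter, and the `toy_similarity` scores are assumptions for illustration only; a real application would use a published similarity matrix.

```python
import math

# Toy similarity scores (hypothetical, NOT a published matrix):
# identical residues score 4, residues in the same coarse group score 1,
# everything else scores -1.
GROUPS = ["AVLIMFWC", "STNQYGP", "DE", "KRH"]


def toy_similarity(a: str, b: str) -> int:
    if a == b:
        return 4
    for group in GROUPS:
        if a in group and b in group:
            return 1
    return -1


def peptide_similarity(x: str, basis: str) -> int:
    """Sum position-wise similarity scores of two equal-length peptides."""
    if len(x) != len(basis):
        raise ValueError("peptides must have the same length")
    return sum(toy_similarity(a, b) for a, b in zip(x, basis))


def bio_basis(x: str, basis: str, gamma: float = 1.0) -> float:
    """Assumed bio-basis activation: 1.0 when x equals the basis, smaller otherwise."""
    self_score = peptide_similarity(basis, basis)  # maximum attainable score
    score = peptide_similarity(x, basis)
    return math.exp(gamma * (score - self_score) / self_score)
```

With this form the activation peaks at the basis peptide itself and decays as the input becomes less similar, which plays the role the Gaussian plays in an ordinary radial basis function network.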

4.

Background

Proteases of human pathogens are becoming increasingly important drug targets, hence it is necessary to understand their substrate specificity and to interpret this knowledge in practically useful ways. New methods are being developed that produce large amounts of cleavage information for individual proteases and some have been applied to extract cleavage rules from data. However, the hitherto proposed methods for extracting rules have been neither easy to understand nor very accurate. To be practically useful, cleavage rules should be accurate, compact, and expressed in an easily understandable way.

Results

A new method is presented for producing cleavage rules for viral proteases with seemingly complex cleavage profiles. The method is based on orthogonal search-based rule extraction (OSRE) combined with spectral clustering. It is demonstrated on substrate data sets for human immunodeficiency virus type 1 (HIV-1) protease and hepatitis C (HCV) NS3/4A protease, showing excellent prediction performance for both HIV-1 cleavage and HCV NS3/4A cleavage, agreeing with observed HCV genotype differences. New cleavage rules (consensus sequences) are suggested for HIV-1 and HCV NS3/4A cleavages. The practical usability of the method is also demonstrated by using it to predict the location of an internal cleavage site in the HCV NS3 protease and to correct the location of a previously reported internal cleavage site in the HCV NS3 protease. The method is fast to converge and yields accurate rules, on par with previous results for HIV-1 protease and better than previous state-of-the-art for HCV NS3/4A protease. Moreover, the rules are fewer and simpler than previously obtained with rule extraction methods.

Conclusion

A rule extraction methodology by searching for multivariate low-order predicates yields results that significantly outperform existing rule bases on out-of-sample data, while also being more transparent to expert users. The approach yields rules that are easy to use and useful for interpreting experimental data.

5.
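HIV-1 protease specificity site prediction based on two-layer feature selection   Total citations: 1, self-citations: 1, citations by others: 0
HIV-1 protease inhibitors play an important role in anti-AIDS therapy, and studying the cleavage sites of HIV-1 protease can help identify new therapeutic targets. To predict HIV-1 protease specificity sites, this study used the 531 amino acid physicochemical property parameters in the Amino Acid Index database (AAIndex) to characterize the structure of peptide samples directly. Two-layer feature selection reduced the 4248 descriptor parameters to 57. Support vector machine (SVM) models of HIV-1 protease specificity sites were built with four kernel functions, and model accuracy was assessed by 10-fold cross-validation and an external test set. The results show that SVM modeling with the NormalizePolyKernel kernel outperformed the other kernels (PolyKernel, PUK, RBFKernel); the resulting model achieved a 10-fold cross-validation prediction accuracy of 93.947% on the training set and a prediction accuracy of 93.684% on the external test set.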
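The feature count in this abstract can be checked with a short sketch. It assumes each sample is an 8-residue peptide (an inference from 4248 / 531 = 8, not stated in the abstract) and uses a random stand-in for the AAindex table; `property_table` and `encode_peptide` are hypothetical names introduced for illustration.

```python
import random

N_PROPERTIES = 531      # AAindex properties per residue (from the abstract)
PEPTIDE_LENGTH = 8      # assumption: 4248 / 531 = 8 residues per sample
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical stand-in for the AAindex table: one random value per
# (amino acid, property) pair. Real values would be loaded from AAindex.
rng = random.Random(0)
property_table = {
    aa: [rng.random() for _ in range(N_PROPERTIES)] for aa in AMINO_ACIDS
}


def encode_peptide(peptide: str) -> list[float]:
    """Concatenate the property vectors of each residue, position by position."""
    if len(peptide) != PEPTIDE_LENGTH:
        raise ValueError(f"expected {PEPTIDE_LENGTH} residues")
    features: list[float] = []
    for residue in peptide:
        features.extend(property_table[residue])
    return features


features = encode_peptide("SQNYPIVQ")
print(len(features))  # 4248
```

Under that assumption, the two-layer feature selection described in the abstract would start from this 4248-dimensional vector and keep 57 of its components.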

6.
Deciphering the knowledge of HIV protease specificity and developing computational tools for detecting its cleavage sites in protein polypeptide chains are very desirable for designing efficient and specific chemical inhibitors to prevent acquired immunodeficiency syndrome. In this study, we developed a generative model based on a generalization of variable order Markov chains (VOMC) for peptide sequences and adapted the model for prediction of their cleavability by certain proteases. The new method, called variable context Markov chains (VCMC), attempts to identify the context equivalence based on the evolutionary similarities between individual amino acids. It was applied to the HIV-1 protease cleavage site prediction problem and shown to outperform existing methods in terms of prediction accuracy on a common dataset. In general, the method is a promising tool for predicting the cleavage sites of all proteases, and its use is encouraged for any kind of peptide classification problem as well.

7.
MOTIVATION: In protein chemistry, proteomics and biopharmaceutical development, there is a desire to know not only where a protein is cleaved by a protease, but also the susceptibility of its cleavage sites. The current tools for proteolytic cleavage prediction have often relied purely on regular expressions, or involve models that do not represent biological data well. RESULTS: A novel methodology for characterizing proteolytic cleavage site activities has been developed, which incorporates two fundamental features: activity class prediction and the use of an amino acid similarity matrix for (non-parametric) neural learning. The first solved the problem of predicting proteolytic efficiency. The second significantly improved the robustness in prediction and reduced the time complexity for learning. This study shows that activity class prediction is successful when applying this methodology to the prediction and characterization of Trypsin cleavage sites and the prediction of HIV protease cleavage sites. AVAILABILITY: Requests for software and data should be made respectively to Dr Zheng Rong Yang and Miss Rebecca Thomson.

8.
Meller J  Elber R 《Proteins》2001,45(3):241-261
The design of scoring functions (or potentials) for threading, differentiating native-like from non-native structures with a limited computational cost, is an active field of research. We revisit two widely used families of threading potentials: the pairwise and profile models. To design optimal scoring functions we use linear programming (LP). The LP protocol makes it possible to measure the difficulty of a particular training set in conjunction with a specific form of the scoring function. Gapless threading demonstrates that pair potentials have larger prediction capacity compared with profile energies. However, alignments with gaps are easier to compute with profile potentials. We therefore search and propose a new profile model with comparable prediction capacity to contact potentials. A protocol to determine optimal energy parameters for gaps, using LP, is also presented. A statistical test, based on a combination of local and global Z-scores, is employed to filter out false-positives. Extensive tests of the new protocol are presented. The new model provides an efficient alternative for threading with pair energies, maintaining comparable accuracy. The code, databases, and a prediction server are available at http://www.tc.cornell.edu/CBIO/loopp.

9.
Proteases have central roles in "life and death" processes due to their important ability to catalytically hydrolyze protein substrates, usually altering the function and/or activity of the target in the process. Knowledge of the substrate specificity of a protease should, in theory, dramatically improve the ability to predict target protein substrates. However, experimental identification and characterization of protease substrates is often difficult and time-consuming. Thus solving the "substrate identification" problem is fundamental to both understanding protease biology and the development of therapeutics that target specific protease-regulated pathways. In this context, bioinformatic prediction of protease substrates may provide useful and experimentally testable information about novel potential cleavage sites in candidate substrates. In this article, we provide an overview of recent advances in developing bioinformatic approaches for predicting protease substrate cleavage sites and identifying novel putative substrates. We discuss the advantages and drawbacks of the current methods and detail how more accurate models can be built by deriving multiple sequence and structural features of substrates. We also provide some suggestions about how future studies might further improve the accuracy of protease substrate specificity prediction.

10.
The in vivo high‐throughput screening (HTS) of human immunodeficiency virus (HIV) protease inhibitors is a significant challenge because of the lack of reliable assays that allow the visualization of HIV targets within living cells. In this study, we developed a new molecular probe that utilizes the principles of Förster resonance energy transfer (FRET) to visualize HIV‐1 protease inhibition within living cells. The probe is constructed by linking two fluorescent proteins: AcGFP1 (a mutant green fluorescent protein) and mCherry (a red fluorescent protein) with an HIV‐1 protease cleavable p2/p7 peptide. The cleavage of the linker peptide by HIV‐1 protease leads to separation of AcGFP1 from mCherry, quenching FRET between AcGFP1 and mCherry. Conversely, the addition of a protease inhibitor prevents the cleavage of the linker peptide by the protease, allowing FRET from AcGFP1 to mCherry. Thus, HIV‐1 protease inhibition can be determined by measuring the change in the FRET signal generated by the probe. Both in vitro and in vivo studies demonstrated the feasibility of applying the probe for quantitative analyses of HIV‐1 protease inhibition. By cotransfecting HIV‐1 protease and the probe expression plasmids into 293T cells, we showed that the inhibition of HIV‐1 protease by inhibitors can be visualized or quantitatively determined within living cells through ratiometric FRET microscopy imaging measurement. It is expected that this new probe will allow high‐content screening (HCS) of new anti‐HIV drugs. © 2011 American Institute of Chemical Engineers Biotechnol. Prog., 2011

11.
A sequence-coupled (Markov chain) model is proposed to predict the cleavage sites in proteins by proteases with extended specificity subsites. In addition to the probability of an amino acid occurring at each of these subsites as observed from a training set of oligopeptides known to be cleavable by HIV protease, the conditional probabilities as reflected by the neighbor-coupled effect along the subsite sequence are also taken into account. These conditional probabilities are derived from an expanded training set consisting of sufficiently large peptide sequences generated by the Monte Carlo sampling process. Very high accuracy was obtained in predicting protein cleavage sites by both HIV-1 and HIV-2 proteases. The new method provides a rapid and accurate means for analyzing the specificity of HIV protease, and hence can be used to help find effective inhibitors of HIV protease as potential drugs against AIDS. The principle of this method can also be used to study the specificity of any multisubsite enzyme.
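The sequence-coupled idea above can be sketched as a position-specific first-order Markov chain: the probability of a peptide is the probability of its first residue times, for each later subsite, the probability of that residue given its left neighbour. This is only an illustrative sketch under stated assumptions: the add-one `pseudocount` smoothing, the function name `train_markov_model`, and the three toy training peptides are not from the abstract, and the Monte Carlo expansion of the training set that the abstract describes is not reproduced here.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train_markov_model(peptides: list[str], pseudocount: float = 1.0):
    """Estimate position-specific first-order Markov probabilities.

    Returns a function mapping a peptide to its log-probability.
    """
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all training peptides must have the same length")

    first = Counter(p[0] for p in peptides)          # counts at subsite 1
    pairs = [Counter() for _ in range(length - 1)]   # (previous, current) counts
    prevs = [Counter() for _ in range(length - 1)]   # previous-residue counts
    for p in peptides:
        for i in range(1, length):
            pairs[i - 1][(p[i - 1], p[i])] += 1
            prevs[i - 1][p[i - 1]] += 1

    k = len(AMINO_ACIDS)
    n = len(peptides)

    def log_prob(peptide: str) -> float:
        if len(peptide) != length:
            raise ValueError(f"expected {length} residues")
        # Smoothed probability of the first residue.
        lp = math.log((first[peptide[0]] + pseudocount) / (n + pseudocount * k))
        # Smoothed conditional probability of each residue given its neighbour.
        for i in range(1, length):
            num = pairs[i - 1][(peptide[i - 1], peptide[i])] + pseudocount
            den = prevs[i - 1][peptide[i - 1]] + pseudocount * k
            lp += math.log(num / den)
        return lp

    return log_prob


# Toy octapeptides used purely for illustration.
TOY_CLEAVED = ["SQNYPIVQ", "ARVLAEAM", "ATIMMQRG"]
log_prob = train_markov_model(TOY_CLEAVED)
print(log_prob("SQNYPIVQ") > log_prob("WWWWWWWW"))  # True
```

In practice one would presumably compare scores from a model trained on cleavable peptides with scores from a model trained on non-cleavable ones; that decision rule is an assumption and is left out of the sketch.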

12.
A protein-protein docking procedure traditionally consists of two successive tasks: a search algorithm generates a large number of candidate conformations mimicking the complex existing in vivo between two proteins, and a scoring function is used to rank them in order to extract a native-like one. We have already shown that using Voronoi constructions and a well chosen set of parameters, an accurate scoring function could be designed and optimized. However, to be able to perform large-scale in silico exploration of the interactome, a near-native solution has to be found in the ten best-ranked solutions. This cannot yet be guaranteed by any of the existing scoring functions. In this work, we introduce a new procedure for conformation ranking. We previously developed a set of scoring functions where learning was performed using a genetic algorithm. These functions were used to assign a rank to each possible conformation. We now have a refined rank using different classifiers (decision trees, rules and support vector machines) in a collaborative filtering scheme. The scoring function newly obtained is evaluated using 10-fold cross-validation, and compared to the functions obtained using either genetic algorithms or collaborative filtering taken separately. This new approach was successfully applied to the CAPRI scoring ensembles. We show that for 10 targets out of 12, we are able to find a near-native conformation in the 10 best ranked solutions. Moreover, for 6 of them, the near-native conformation selected is of high accuracy. Finally, we show that this function dramatically enriches the 100 best-ranking conformations in near-native structures.

13.
Prediction of protein domain with mRMR feature selection and analysis   Total citations: 2, self-citations: 0, citations by others: 2
Li BQ  Hu LL  Chen L  Feng KY  Cai YD  Chou KC 《PloS one》2012,7(6):e39308
The domains are the structural and functional units of proteins. With the avalanche of protein sequences generated in the postgenomic age, it is highly desired to develop effective methods for predicting the protein domains according to the sequence information alone, so as to facilitate the structure prediction of proteins and speed up their functional annotation. However, although many efforts have been made in this regard, prediction of protein domains from the sequence information still remains a challenging and elusive problem. Here, a new method was developed by combining the techniques of RF (random forest), mRMR (maximum relevance minimum redundancy), and IFS (incremental feature selection), as well as by incorporating the features of physicochemical and biochemical properties, sequence conservation, residual disorder, secondary structure, and solvent accessibility. The overall success rate achieved by the new method on an independent dataset was around 73%, which was about 28-40% higher than those by the existing method on the same benchmark dataset. Furthermore, it was revealed by an in-depth analysis that the features of evolution, codon diversity, electrostatic charge, and disorder played more important roles than the others in predicting protein domains, quite consistent with experimental observations. It is anticipated that the new method may become a high-throughput tool in annotating protein domains, or may, at the very least, play a complementary role to the existing domain prediction methods, and that the findings about the key features with high impact on domain prediction might provide useful insights or clues for further experimental investigations in this area. Finally, it has not escaped our notice that the current approach can also be utilized to study protein signal peptides, B-cell epitopes, HIV protease cleavage sites, among many other important topics in protein science and biomedicine.

14.
15.
16.
MOTIVATION: Apoptosis has drawn the attention of researchers because of its importance in treating some diseases through finding a proper way to block or slow down the apoptosis process. Since caspase cleavage is the key to apoptosis, novel methods or algorithms are essential for studying the specificity of caspase cleavage activity, which in turn supports effective drug design. As bio-basis function neural networks have proven to outperform some conventional neural learning algorithms, there is a motivation, in this study, to investigate the application of bio-basis function neural networks for the prediction of caspase cleavage sites. RESULTS: Thirteen protein sequences with experimentally determined caspase cleavage sites were downloaded from NCBI. Bayesian bio-basis function neural networks are investigated and the comparisons with single-layer perceptrons, multilayer perceptrons, the original bio-basis function neural networks and support vector machines are given. The impact of the sliding window size used to generate sub-sequences for modelling on prediction accuracy is studied. The results show that the Bayesian bio-basis function neural network with two Gaussian distributions for model parameters (weights) performed the best and the highest prediction accuracy is 97.15 +/- 1.13%. AVAILABILITY: The package of Bayesian bio-basis function neural network can be obtained by request to the author.
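The sliding-window step mentioned above, which turns a protein sequence into fixed-length sub-sequences for modelling, might look like the following sketch. The function name `sliding_windows` and the choice to return the start index with each window are assumptions; the abstract does not say which window sizes were used or how windows were labelled.

```python
def sliding_windows(sequence: str, window: int) -> list[tuple[int, str]]:
    """Return (start_index, sub-sequence) for every full-length window."""
    if window <= 0:
        raise ValueError("window must be positive")
    # A sequence shorter than the window yields no sub-sequences.
    return [
        (start, sequence[start:start + window])
        for start in range(len(sequence) - window + 1)
    ]


for start, sub in sliding_windows("MKTAYIAK", 5):
    print(start, sub)
```

Each sub-sequence would then be labelled according to whether a known cleavage site falls at a chosen position inside it, so changing `window` changes how much sequence context the classifier sees.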

17.
Knowledge of the polyprotein cleavage sites by HIV protease will refine our understanding of its specificity, and the information thus acquired will be useful for designing specific and efficient HIV protease inhibitors. The search for inhibitors of HIV protease will be greatly expedited if one can find an accurate, robust, and rapid method for predicting the cleavage sites in proteins by HIV protease. In this paper, Kohonen’s self-organization model, which uses typical artificial neural networks, is applied to predict the cleavability of oligopeptides by proteases with multiple and extended specificity subsites. We selected HIV-1 protease as the subject of study. We chose 299 oligopeptides for the training set, and another 63 oligopeptides for the test set. Because of its high rate of correct prediction (58/63=92.06%) and stronger fault-tolerant ability, the neural network method should be a useful technique for finding effective inhibitors of HIV protease, which is one of the targets in designing potential drugs against AIDS. The principle of the artificial neural network method can also be applied to analyzing the specificity of any multisubsite enzyme.

18.
19.
Bio-support vector machines for computational proteomics   Total citations: 2, self-citations: 0, citations by others: 2
MOTIVATION: One of the most important issues in computational proteomics is to produce a prediction model for the classification or annotation of biological function of novel protein sequences. In order to improve the prediction accuracy, much attention has been paid to improving the performance of the algorithms used, but little to solving the fundamental issue, namely amino acid encoding, as most existing pattern recognition algorithms are unable to recognize amino acids in protein sequences. Importantly, the most commonly used amino acid encoding method has a flaw that leads to large computational cost and recognition bias. RESULTS: By replacing the kernel functions of support vector machines (SVMs) with amino acid similarity measurement matrices, we have modified SVMs to give a new type of pattern recognition algorithm for analysing protein sequences, particularly for proteolytic cleavage site prediction. We refer to the modified SVMs as bio-support vector machines. When applied to the prediction of HIV protease cleavage sites, the new method has shown a remarkable advantage in reducing the model complexity and enhancing the model robustness.

20.
We describe a genetic system that allows in vivo screening or selection of site-specific proteases and of their cognate-specific inhibitors in Escherichia coli. This genetic test is based on the specific proteolysis of a signaling enzyme, the adenylate cyclase (AC) of Bordetella pertussis. As a model system we used the human immunodeficiency virus (HIV) protease. When an HIV protease processing site, p5, was inserted in frame into the AC polypeptide, the resulting ACp5 protein retained enzymatic activity and, when expressed in an E. coli cya strain, restored the Cya(+) phenotype. The HIV protease coexpressed in the same cells resulted in cleavage and inactivation of ACp5; the cells became Cya(-). When the entire HIV protease, including its adjacent processing sites, was inserted into the AC polypeptide, the resulting AC-HIV-Pr fusion protein, expressed in E. coli cya, was autoproteolysed and inactivated: the cells displayed a Cya(-) phenotype. In the presence of the protease inhibitor indinavir or saquinavir, AC-HIV-Pr autoproteolysis was inhibited and the AC activity of the fusion protein was preserved; the cells were Cya(+). Protease variants resistant to particular inhibitors could be easily distinguished from the wild type, as the cells displayed a Cya(-) phenotype in the presence of these inhibitors. This genetic test could represent a powerful approach to screen for new proteolytic activities and for novel protease inhibitors. It could also be used to detect, in patients undergoing highly active antiretroviral therapy, the emergence of HIV variants harboring antiprotease-resistant proteases.
