Similar literature
 Found 20 similar articles; search took 31 ms
1.
Li Y  Wen Z  Zhou C  Tan F  Li M 《Peptides》2008,29(9):1498-1504
Signal peptides play a pivotal role in the translocation of secretory proteins. Several models have been designed to predict their cleavage sites. The cleavage site is reported to be related to the neighboring sequence environment, i.e., the hydrophobic core (h-region) and the specific patterns in the c-region. In some studies, this finding does facilitate the prediction of the cleavage site. However, in these models, sequence environment information is merely taken into account as a model input, and no detailed investigation of its effect on cleavage site prediction has been made. In this work, we analyze the constraint placed on the cleavage site by the hydrophobic core of the signal peptide and then use it to improve the performance of signal peptide cleavage site prediction. Our model is designed as follows: first, a sliding window is used to scan the sample and an artificial neural network (ANN) is employed to give cleavage site/non-cleavage site scores. Then, based on an estimated hydrophobic h-region, a correcting function that takes the sequence environment into account is proposed to improve the prediction result. Our analysis indicates a cleavage site trend for each position, which is consistent with experimental findings. Through this correcting step, prediction accuracy improves by over 7%. This demonstrates that the neighboring sequence environment is helpful for determining the cleavage site. A program written in Matlab can be downloaded from http://www.scucic.cn/combined model/source code.html.
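
The scanning step described in this abstract can be sketched roughly as follows. This is only an illustration, not the authors' Matlab program: the window width, the `toy_score` function (a hydrophobic-fraction stand-in for the trained ANN), the residue grouping, and the example sequence are all assumptions introduced here.

```python
from typing import Callable, Iterator, List, Tuple

# Assumed hydrophobic residue set, chosen only for this illustration.
HYDROPHOBIC = set("AVLIMFWC")


def sliding_windows(seq: str, width: int) -> Iterator[Tuple[int, str]]:
    """Yield (start_index, window) for every full-width window in seq."""
    for start in range(len(seq) - width + 1):
        yield start, seq[start:start + width]


def toy_score(window: str) -> float:
    """Hypothetical stand-in for the ANN: fraction of hydrophobic residues."""
    return sum(res in HYDROPHOBIC for res in window) / len(window)


def scan(seq: str, width: int,
         score: Callable[[str], float]) -> List[Tuple[int, float]]:
    """Score every window; a trained model would return cleavage-site scores."""
    return [(start, score(win)) for start, win in sliding_windows(seq, width)]


# Made-up sequence fragment, scanned with a window of three residues.
scores = scan("MKALLVA", 3, toy_score)
```

In a setup like the one the abstract describes, `toy_score` would be replaced by the trained network, and a separate correcting function based on the estimated h-region would then adjust the per-position scores.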

2.
Knowledge of the polyprotein cleavage sites by HIV protease will refine our understanding of its specificity, and the information thus acquired will be useful for designing specific and efficient HIV protease inhibitors. The search for inhibitors of HIV protease will be greatly expedited if one can find an accurate, robust, and rapid method for predicting the cleavage sites in proteins by HIV protease. In this paper, Kohonen’s self-organization model, which uses typical artificial neural networks, is applied to predict the cleavability of oligopeptides by proteases with multiple and extended specificity subsites. We selected HIV-1 protease as the subject of study. We chose 299 oligopeptides for the training set, and another 63 oligopeptides for the test set. Because of its high rate of correct prediction (58/63=92.06%) and stronger fault-tolerant ability, the neural network method should be a useful technique for finding effective inhibitors of HIV protease, which is one of the targets in designing potential drugs against AIDS. The principle of the artificial neural network method can also be applied to analyzing the specificity of any multisubsite enzyme.

4.
We present a neural network based method (ChloroP) for identifying chloroplast transit peptides and their cleavage sites. Using cross-validation, 88% of the sequences in our homology-reduced training set were correctly classified as transit peptides or nontransit peptides. This performance level is well above that of the publicly available chloroplast localization predictor PSORT. Cleavage sites are predicted using a scoring matrix derived by an automatic motif-finding algorithm. Approximately 60% of the known cleavage sites in our sequence collection were predicted to within ±2 residues from the cleavage sites given in SWISS-PROT. An analysis of 715 Arabidopsis thaliana sequences from SWISS-PROT suggests that the ChloroP method should be useful for the identification of putative transit peptides in genome-wide sequence data. The ChloroP predictor is available as a web-server at http://www.cbs.dtu.dk/services/ChloroP/.

5.
BACKGROUND: The ability to predict the native conformation of a globular protein from its amino-acid sequence is an important unsolved problem of molecular biology. We have previously reported a method in which reduced representations of proteins are folded on a lattice by Monte Carlo simulation, using statistically-derived potentials. When applied to sequences designed to fold into four-helix bundles, this method generated predicted conformations closely resembling the real ones. RESULTS: We now report a hierarchical approach to protein-structure prediction, in which two cycles of the above-mentioned lattice method (the second on a finer lattice) are followed by a full-atom molecular dynamics simulation. The end product of the simulations is thus a full-atom representation of the predicted structure. The application of this procedure to the 60-residue B domain of staphylococcal protein A predicts a three-helix bundle with a backbone root mean square (rms) deviation of 2.25-3 Å from the experimentally determined structure. Further application to a designed, 120-residue monomeric protein, mROP, based on the dimeric ROP protein of Escherichia coli, predicts a left-turning, four-helix bundle native state. Although the ultimate assessment of the quality of this prediction awaits the experimental determination of the mROP structure, a comparison of this structure with the set of equivalent residues in the ROP dimer crystal structure indicates that they have an rms deviation of approximately 3.6-4.2 Å. CONCLUSION: Thus, for a set of helical proteins that have simple native topologies, the native folds of the proteins can be predicted with reasonable accuracy from their sequences alone. Our approach suggests a direction for future work addressing the protein-folding problem.

6.
Liu DQ  Liu H  Shen HB  Yang J  Chou KC 《Amino acids》2007,32(4):493-496
Summary. A newly synthesized secretory protein in cells bears a special sequence, called the signal peptide or signal sequence, which plays the role of an “address tag” in guiding the protein to wherever it is needed. This unique function of signal sequences has stimulated novel strategies for drug design and for reprogramming cells for gene therapy. To realize these new ideas and plans, however, it is important to develop an automated method for quickly and accurately identifying signal sequences or their cleavage sites. In this paper, a new method is developed for predicting the signal sequence of a query secretory protein by fusing the results from a series of global alignments through a voting system. The very high success rates thus obtained suggest that the novel approach is very promising, and that the new method may become a useful vehicle for identifying signal sequences, or at least serve as a complementary tool to the existing algorithms in this field.

7.
In this paper, we present methods to detect and localize patterns in biologically related protein sequences (family). The patterns common to the sequences of the family are detected by using Fourier analysis. No previous scales (codes) are needed; they are actually produced as a result of the analysis procedure, together with the frequencies of the Fourier decompositions. Characteristic features of the family are thus expressed as (code–frequency) pairs. Various tools are proposed in order to localize the patterns, to compare the codes, and to evaluate the proximity of an arbitrary sequence to the investigated family. The general strategy is illustrated on a family composed of proteins ... Received on October 17, 1989; accepted on January 16, 1990.

8.
We have developed a new method for the identification of signal peptides and their cleavage sites based on neural networks trained on separate sets of prokaryotic and eukaryotic sequences. The method performs significantly better than previous prediction schemes, and can easily be applied to genome-wide data sets. Discrimination between cleaved signal peptides and uncleaved N-terminal signal-anchor sequences is also possible, though with lower precision. Predictions can be made on a publicly available WWW server: http://www.cbs.dtu.dk/services/SignalP/.

9.
Understanding the energetics and mechanism of protein-protein association remains one of the biggest theoretical problems in structural biology. It is assumed that desolvation must play an essential role during the association process, and indeed protein-protein interfaces in obligate complexes have been found to be highly hydrophobic. However, the identification of protein interaction sites from surface analysis of proteins involved in non-obligate protein-protein complexes is more challenging. Here we present Optimal Docking Area (ODA), a new fast and accurate method of analyzing a protein surface in search of areas with favorable energy change when buried upon protein-protein association. The method identifies continuous surface patches with optimal docking desolvation energy based on atomic solvation parameters adjusted for protein-protein docking. The procedure has been validated on the unbound structures of a total of 66 non-homologous proteins involved in non-obligate protein-protein hetero-complexes of known structure. Optimal docking areas with significantly low docking surface energy were found in around half of the proteins. The 'ODA hot spots' detected in X-ray unbound structures were correctly located in the known protein-protein binding sites in 80% of the cases. The role of these low-surface-energy areas during complex formation is discussed. Burial of these regions during protein-protein association may favor the complexed configurations with near-native interfaces but otherwise arbitrary orientations, thus driving the formation of an encounter complex. The patch prediction procedure is freely accessible at http://www.molsoft.com/oda and can be easily scaled up for predictions in structural proteomics.

10.
The structure of an oligodeoxyribonucleotide may be determined by a simple two-dimensional separation on a polyethyleneimine-cellulose thin layer sheet. Chromatography in the first dimension fractionates by chain length a nested set of fragments that are generated by subjecting the oligomer to partial spleen phosphodiesterase degradation and then labelling their non-common ends with 32P using polynucleotide kinase. A subsequent in situ treatment with nuclease Bal 31 produces labelled mononucleotides, and these are identified by chromatography in the second dimension. Since the method does not identify the 3' terminal nucleotide, a convenient procedure involving 3' end labelling followed by enzymatic digestion to monomers has been developed for this purpose. This approach to sequence analysis also has the advantage of permitting assignment of the identity and location of any modified or unusual bases within the oligonucleotide.

11.
Computational methods for predicting protein-protein interaction sites based on structural data are characterized by an accuracy between 70 and 80%. Some experimental studies indicate that only a fraction of the residues, forming clusters in the center of the interaction site, are energetically important for binding. In addition, the analysis of amino acid composition has shown that residues located in the center of the interaction site can be better discriminated from the residues in other parts of the protein surface. In the present study, we implement a simple method to predict interaction site residues exploiting this fact and show that it achieves a very competitive performance compared to other methods using the same dataset and criteria for performance evaluation (success rate of 82.1%).

12.
A vector projection method is proposed to predict the cleavability of oligopeptides by extended-specificity site proteases. For an enzyme with eight specificity subsites, the substrate octapeptide can be uniquely expressed as a vector in an 8-dimensional space, whose eight bases correspond to the amino acids at the eight subsites, P4, P3, P2, P1, P1′, P2′, P3′ and P4′, respectively. The component of such a characteristic vector on each of the eight bases is defined as the frequency of an amino acid occurring at a given site. These frequencies were derived from a set of octapeptides known to be cleaved by HIV protease. The cleavability of an octapeptide can then be estimated from the projection of its characteristic vector on an idealized, optimally cleavable vector. The high ratio of correct prediction vs. total prediction for the data in both the training and the testing sets indicates that the new method is self-consistent and efficient. It provides a rapid and accurate algorithm for analyzing the specificity of any multisubsite enzyme for which there is no coupling between subsites. In particular, it is useful for predicting the cleavability of an oligopeptide by either HIV-1 or HIV-2 protease, and hence offers a supplementary means for finding effective inhibitors of HIV protease as potential drugs against AIDS. © Wiley-Liss, Inc.
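
The projection idea in this abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so the normalization used in `projection_score` (scalar product with the ideal vector divided by the ideal vector's squared length, so that the ideal peptide scores 1.0) is an assumption, as are all function names and the made-up training peptides, which are not real HIV protease substrates.

```python
from collections import Counter
from typing import Dict, List


def site_frequencies(cleaved: List[str]) -> List[Dict[str, float]]:
    """Per-subsite amino-acid frequencies from equal-length cleaved peptides."""
    n = len(cleaved)
    length = len(cleaved[0])
    return [
        {aa: count / n for aa, count in Counter(p[i] for p in cleaved).items()}
        for i in range(length)
    ]


def characteristic_vector(peptide: str,
                          freqs: List[Dict[str, float]]) -> List[float]:
    """Component i is the training frequency of the peptide's residue at site i."""
    return [freqs[i].get(aa, 0.0) for i, aa in enumerate(peptide)]


def ideal_vector(freqs: List[Dict[str, float]]) -> List[float]:
    """Idealized vector: the highest observed frequency at every subsite."""
    return [max(site.values()) for site in freqs]


def projection_score(peptide: str, freqs: List[Dict[str, float]]) -> float:
    """Assumed normalization: projection onto the ideal vector, scaled to 1.0."""
    v = characteristic_vector(peptide, freqs)
    ideal = ideal_vector(freqs)
    dot = sum(a * b for a, b in zip(v, ideal))
    return dot / sum(b * b for b in ideal)


# Toy training set of made-up octapeptides (illustration only).
training = ["AAAAAAAA", "AAAAAAAC"]
freqs = site_frequencies(training)
best = projection_score("AAAAAAAA", freqs)
worst = projection_score("DDDDDDDD", freqs)
```

Because each subsite contributes independently to the score, a sketch like this only makes sense under the condition the abstract itself states: no coupling between subsites.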

13.
A method for predicting RNA structure from amino-acid sequence data   Total citations: 1, self-citations: 0, citations by others: 1
A method is presented that enables computation of the stability of hairpin loops in an RNA chain, knowing only the amino-acid sequence translated from that chain. The method is based on a statistical decoding procedure and a thermodynamic evaluation of bonding in terms of free energy. The decoding takes into account all triplets of bases for each amino acid. Every possible hairpin loop formed by bonding between base pairs is examined by a computer program. Using thermodynamic criteria, it is decided which loops have a high probability of being bound. For these, a free energy of loop formation is evaluated in order to predict stability. The method is tested on 54 amino-acid sequences obtained by translating 18 different tRNAs. The loops predicted from the amino-acid sequence are compared with their actual counterparts in the original nucleotide sequence. A strong positive correlation is observed between the predicted and actual stabilities. A hairpin loop predicted to be stable from the amino-acid sequence has, on average, a probability of 0.5 of being stable, 0.3 of being metastable and 0.2 of being unstable, in the actual nucleotide sequence. Various applications of this method to obtain information about sequence and structure in messenger RNAs are discussed.

14.
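Li N  Li C 《生物信息学》2012,10(4):238-240
Based on 16 classification models of amino acids, derived sequences of a protein sequence are given; weighted quasi-entropy and LZ complexity are then combined to construct a 34-dimensional feature vector that represents the protein sequence. Using a Bayes classifier, protein structural class prediction was carried out on the 640 dataset, whose sequence homology does not exceed 25%, and an accuracy of 71.28% was achieved.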
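
Two ingredients named in this abstract, a derived sequence built from an amino-acid classification and a Lempel-Ziv complexity measure, can be sketched as follows. The abstract specifies neither the 16 classification models nor the LZ variant, so the two-class grouping in `HYDROPHOBIC` and the LZ78-style phrase count below are assumptions chosen for illustration only.

```python
# Assumed two-class grouping (hydrophobic vs. other); not one of the
# paper's 16 classification models, which the abstract does not list.
HYDROPHOBIC = set("AVLIMFWC")


def derive_sequence(protein: str) -> str:
    """Map each residue to its class label: 'H' (hydrophobic) or 'P' (other)."""
    return "".join("H" if res in HYDROPHOBIC else "P" for res in protein)


def lz_phrase_count(s: str) -> int:
    """LZ78-style complexity: number of phrases in an incremental parsing.

    Each phrase is the shortest prefix of the remaining text that has not
    been seen as a phrase before; a trailing incomplete phrase counts once.
    """
    phrases = set()
    current = ""
    for ch in s:
        current += ch
        if current not in phrases:
            phrases.add(current)
            current = ""
    return len(phrases) + (1 if current else 0)


# Made-up four-residue fragment, used only to show the two steps together.
derived = derive_sequence("MKAL")
complexity = lz_phrase_count(derived)
```

A feature vector in the spirit of the abstract would collect such complexity values (together with the entropy-based terms) across all derived sequences; how the 34 dimensions are assembled is not stated in the source.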

15.
16.
Nanni L  Lumini A 《Amino acids》2009,36(3):409-416
The focus of this work is the use of ensembles of classifiers for predicting HIV protease cleavage sites in proteins. Due to the complex relationships in the biological data, several recent works show that often ensembles of learning algorithms outperform stand-alone methods. We show that the fusion of approaches based on different encoding models can be useful for improving the performance of this classification problem. In particular, in this work four different feature encodings for peptides are described and tested. An extensive evaluation on a large dataset according to a blind testing protocol is reported which demonstrates how different feature extraction methods and classifiers can be combined for obtaining a robust and reliable system. The comparison with other stand-alone approaches allows quantifying the performance improvement obtained by the ensembles proposed in this work.

17.
Summary. We have developed a theory to estimate the degree of sequence divergence between related DNAs from the comparison of restriction endonuclease recognition sites. Two major improvements have been made upon a similar method reported by Upholt (1977). First, the most probable value is calculated by the collective use of all available data. This reduces intrinsic statistical error and extends the analyzable range of sequence divergence. Second, all variables are redefined so that they have strict mathematical implications. This corrects a serious error arising from the misinterpretation of the meaning of the fraction of conserved cleavage sites. With this refined method, sequence divergence between rat and mouse mitochondrial DNAs (mtDNAs) was calculated to be about 25% substitutions/nucleotide, which is in good agreement with the DNA-DNA hybridization data obtained by Jakovcic et al. (1975). It was also estimated that the three types of rat mtDNAs differ from one another by 0.3–1% of total base pairs. These values are 2–5 times smaller than those obtained with the conventional method.

18.
Chou KC 《Proteins》2001,42(1):136-139
Protein signal sequences play a central role in the targeting and translocation of nearly all secreted proteins and many integral membrane proteins in both prokaryotes and eukaryotes. The knowledge of signal sequences has become a crucial tool for pharmaceutical scientists who genetically modify bacteria, plants, and animals to produce effective drugs. However, to effectively use such a tool, the first important thing is to find a fast and effective method to identify the "zipcode" entity; this is also evoked by both the huge amount of unprocessed data available and the industrial need to find more effective vehicles for the production of proteins in recombinant systems. In view of this, a sequence-encoded algorithm was developed to identify the signal sequences and predict their cleavage sites. The rate of correct prediction for 1,939 secretory proteins and 1,440 nonsecretory proteins by self-consistency test is 90.14% and that by jackknife test is 90.13%. The encouraging results indicate that the signal sequences share some common features although they lack similarity in sequence, length, and even composition and that they are predictable to a considerably accurate extent.

19.
Bacterial lipoproteins have many important functions and represent a class of possible vaccine candidates. The prediction of lipoproteins from sequence is thus an important task for computational vaccinology. Naïve Bayesian networks were trained to identify SpaseII cleavage sites and their preceding signal sequences using a set of 199 distinct lipoprotein sequences. A comprehensive range of sequence models was used to identify the best model for lipoprotein signal sequences. The best performing sequence model was found to be 10 residues in length, including the conserved cysteine lipid attachment site and the nine residues prior to it. The sensitivity of prediction for LipPred was 0.979, while the specificity was 0.742. Here, we describe LipPred, a web server for lipoprotein prediction, available at the URL: http://www.jenner.ac.uk/LipPred/. LipPred is the most accurate method available for the detection of SpaseII-cleaved lipoprotein signal sequences and the prediction of their cleavage sites.
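
For readers unfamiliar with the two figures quoted in this abstract, the sketch below shows the standard confusion-matrix definitions of sensitivity and specificity. The abstract does not say which definitions or counts LipPred used, so both the formulas' applicability and the counts below are assumptions; the numbers are invented and are not LipPred results.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual positives that were predicted positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of actual negatives that were predicted negative."""
    return true_neg / (true_neg + false_pos)


# Invented counts, for illustration only.
sens = sensitivity(true_pos=90, false_neg=10)
spec = specificity(true_neg=80, false_pos=20)
```

A high sensitivity paired with a lower specificity, as reported in the abstract, would mean under these definitions that few true lipoprotein signal sequences are missed while a larger share of non-lipoproteins is flagged.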

20.
Y Wang  R Wu 《Nucleic acids research》1993,21(9):2143-2147
The development of methods for cleavage of DNA at specific site(s) that are widely spaced would facilitate physical mapping of large genomes. Several methods for rare and specific cleavage of chromosomal DNAs require a nearly complete methylation of a given type of restriction site except the one that is specifically protected. It is expected that as the target DNA increases in length, it will become less likely to achieve nearly complete methylation. The intron-encoded endonucleases may also provide a capability to cleave megabase-sized DNA segments due to their very large recognition sequences. However, there are endogenous cleavage sites in the chromosomes of most organisms. We present here a new method to specifically cleave intact chromosomal DNA using lambda-terminase. A plasmid containing two specific cleavage sites (cohesive-end sites) for lambda-terminase was specifically introduced into the E.coli genome and into chromosome V of S.cerevisiae. Chromosomal DNA was prepared from the resulting strains, and then cleaved with lambda-terminase. The results showed that the 4.7-megabase pair (Mb) circular E.coli chromosome and the 0.58-Mb linear yeast chromosome V were specifically cleaved at the desired sites with very high efficiencies. The approach of using the lambda-terminase cleavage reaction is a simple one-step procedure with a high specificity which is particularly suitable for mapping very large genomes of eucaryotes.
