20 similar articles found (search time: 0 ms)
1.
An approach of encoding for prediction of splice sites using SVM (total citations: 1; self-citations: 0; citations by others: 1)
In splice site prediction, accuracy remains below 90% even though the sequences adjacent to splice sites are highly conserved. To improve prediction accuracy, much attention has been paid to improving the algorithms used, and little to the fundamental issue of nucleotide encoding. In this paper, a predictor based on support vector machines (SVM) is constructed to distinguish true from false splice sites in higher eukaryotes. Four types of encoding were applied to generate the input for the SVM: mono-nucleotide (MN) encoding, MN with frequency difference between the true sites and false sites (FDTF) encoding, pair-wise nucleotides (PN) encoding, and PN with FDTF encoding. The results showed that PN with FDTF encoding as input to the SVM led to the most reliable recognition of splice sites: the accuracy for predicting true donor sites and false sites was 96.3% and 93.7%, respectively, and the accuracy for predicting true acceptor sites and false sites was 94.0% and 93.2%, respectively.
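The two base encodings named in this abstract can be illustrated with a minimal sketch. It assumes that MN encoding is a one-hot vector per nucleotide and that PN encoding is a one-hot vector per adjacent nucleotide pair; the abstract does not spell out either definition, and the FDTF weighting is not shown because its formula is not given here. The function names `encode_mono` and `encode_pairs` are hypothetical.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
# All 16 ordered nucleotide pairs: AA, AC, ..., TT.
PAIRS = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]


def encode_mono(seq):
    """Assumed MN encoding: 4 one-hot features per position.

    Characters other than A, C, G, T yield four zeros.
    """
    vec = []
    for base in seq.upper():
        vec.extend(1.0 if base == n else 0.0 for n in NUCLEOTIDES)
    return vec


def encode_pairs(seq):
    """Assumed PN encoding: 16 one-hot features per adjacent pair."""
    seq = seq.upper()
    vec = []
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        vec.extend(1.0 if pair == p else 0.0 for p in PAIRS)
    return vec
```

Under these assumptions a window of n nucleotides gives 4n features for MN and 16(n - 1) features for PN, so the pair-wise form carries neighbour context at the cost of a larger input vector for the SVM.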
2.
This article describes a novel method for predicting ligand-binding sites of proteins. The method uses only 8 structural properties as the input vector to train 9 random forest classifiers, which are combined to predict binding residues. The predicted binding residues are then clustered into predicted ligand-binding sites. According to our measurement criterion, the method achieved a success rate of 0.914 on the bound-state dataset and 0.800 on the unbound-state dataset, both better than three other methods: Q-SiteFinder, SCREEN and Morita's method. This indicates that the proposed method is successful at predicting ligand-binding sites.
3.
Improved prediction of protein-protein binding sites using a support vector machines approach (total citations: 6; self-citations: 0; citations by others: 6)
MOTIVATION: Structural genomics projects are beginning to produce protein structures with unknown function; therefore, accurate, automated predictors of protein function are required if all these structures are to be properly annotated in reasonable time. Identifying the interface between two interacting proteins provides important clues to the function of a protein and can reduce the search space required by docking algorithms to predict the structures of complexes. RESULTS: We have combined a support vector machine (SVM) approach with surface patch analysis to predict protein-protein binding sites. Using a leave-one-out cross-validation procedure, we were able to successfully predict the location of the binding site on 76% of our dataset made up of proteins with both transient and obligate interfaces. With heterogeneous cross-validation, where we trained the SVM on transient complexes to predict on obligate complexes (and vice versa), we still achieved comparable success rates to the leave-one-out cross-validation, suggesting that sufficient properties are shared between transient and obligate interfaces. AVAILABILITY: A web application based on the method can be found at http://www.bioinformatics.leeds.ac.uk/ppi_pred. The dataset of 180 proteins used in this study is also available via the same web site. CONTACT: westhead@bmb.leeds.ac.uk SUPPLEMENTARY INFORMATION: http://www.bioinformatics.leeds.ac.uk/ppi-pred/supp-material.
4.
Improved prediction of bacterial transcription start sites (total citations: 2; self-citations: 0; citations by others: 2)
Gordon JJ Towsey MW Hogan JM Mathews SA Timms P 《Bioinformatics (Oxford, England)》2006,22(2):142-148
5.
Palmitoylation is a universal and important lipid modification involved in a series of basic cellular processes, such as membrane trafficking, protein stability and protein aggregation. With the avalanche of new protein sequences generated in the post-genomic era, it is highly desirable to develop computational methods for rapidly and effectively identifying the potential palmitoylation sites of uncharacterized proteins, so as to provide timely information for revealing the mechanism of protein palmitoylation. By using the Incremental Feature Selection approach based on amino acid factors, conservation, disorder features, and specific features of the palmitoylation site, a new predictor named IFS-Palm was developed in this regard. The overall success rate achieved by the jackknife test on a newly constructed benchmark dataset was 90.65%. An in-depth analysis showed that palmitoylation was intimately correlated with the features of the upstream residue directly adjacent to the cysteine site as well as with the conservation of the cysteine residue. Meanwhile, protein disorder regions might also play an important role in this post-translational modification. These findings may provide useful insights for revealing the mechanisms of palmitoylation.
6.
As a reversible posttranslational modification, protein palmitoylation has the potential to regulate the trafficking and function of a variety of proteins. However, the extent, function, and dynamic nature of palmitoylation are poorly resolved because of limitations in assay methods. Here, we introduce methods where hydroxylamine-mediated cleavage of the palmitoyl-thioester bond generates a free sulfhydryl, which can then be specifically labeled with sulfhydryl-reactive reagents. This methodology is more sensitive and allows for quantitative estimates of palmitoylation. Unlike other techniques used to assay posttranslational modifications, the techniques we have developed can label all sites of modification with a variety of probes, radiolabeled or nonradioactive, and can be used to assay the palmitoylation of proteins expressed in vivo in brain or other tissues.
7.
8.
The spread of drug resistance through malaria parasite populations calls for the development of new therapeutic strategies. However, the seemingly promising genomics-driven target identification paradigm is hampered by weak annotation coverage. To identify potentially important yet uncharacterized proteins, we apply support vector machines using profile kernels, a supervised discriminative machine learning technique for remote homology detection, as a complement to traditional alignment-based algorithms. In this study, we focus on the prediction of proteases, which have long been considered attractive drug targets because of their indispensable roles in parasite development and infection. Our analysis demonstrates that an abundant and complex repertoire is conserved in five Plasmodium parasite species. Several putative proteases may be important components in networks that mediate cellular processes, including hemoglobin digestion, invasion, trafficking, cell cycle fate, and signal transduction. This catalog of proteases provides a short list of targets for functional characterization and rational inhibitor design. Rui Kuang and Jianying Gu have contributed equally to this work. An erratum to this article has been published.
9.
Posttranslational modification of tubulin by palmitoylation: II. Identification of sites of palmitoylation. (total citations: 2; self-citations: 2; citations by others: 2)
As shown in the companion article, tubulin is posttranslationally modified in vivo by palmitoylation. Our goal in this study was to identify the palmitoylation sites by protein structure analysis. To obtain quantities of palmitoylated tubulin required for this analysis, a cell-free system for enzymatic [3H]palmitoylation was developed and characterized in our companion article. We then developed a methodology to examine directly the palmitoylation of all 451 amino acids of alpha-tubulin. 3H-labeled palmitoylated alpha-tubulin was cleaved with cyanogen bromide (CNBr). The CNBr digest was resolved according to peptide size by gel filtration on Sephadex LH60 in formic acid:ethanol. The position of 3H-labeled palmitoylated amino acids in peptides could not be identified by analysis of the Edman degradation sequencer product because the palmitoylated sequencer products were lost during the final derivatization step to phenylthiohydantoin derivatives. Modification of the gas/liquid-phase sequencer to deliver the intermediate anilinothiazolinone derivative, rather than the phenylthiohydantoin derivative, identified the cycle containing the 3H-labeled palmitoylated residue. Therefore, structure analysis of peptides obtained from gel filtration necessitated dual sequencer runs of radioactive peptides, one for sequence analysis and one to identify 3H-labeled palmitoylated amino acids. Further cleavage of the CNBr peptides by trypsin and Lys-C protease, followed by gel filtration on Sephadex LH60 and dual sequencer runs, positioned the 3H-labeled palmitoylated amino acid residues in peptides. Integration of all the available structural information led to the assignment of the palmitoyl moiety to specific residues in alpha-tubulin. The palmitoylated residues in alpha-tubulin were confined to cysteine residues only. The major site for palmitoylation was cysteine residue 376.
10.
Protein–RNA interactions play a key role in a number of biological processes such as protein synthesis, mRNA processing, assembly
and function of ribosomes and eukaryotic spliceosomes. A reliable identification of RNA-binding sites in RNA-binding proteins
is important for functional annotation and site-directed mutagenesis. We developed a novel method for the prediction of protein
residues that interact with RNA using support vector machine (SVM) and position-specific scoring matrices (PSSMs). Two cases
have been considered in the prediction of protein residues at RNA-binding surfaces. One is given the sequence information
of a protein chain that is known to interact with RNA; the other is given the structural information. Thus, five different
inputs have been tested. Coupled with PSI-BLAST profiles and predicted secondary structure, the present approach yields a
Matthews correlation coefficient (MCC) of 0.432 by a 7-fold cross-validation, which is the best among all previously reported
RNA-binding site prediction methods. When given the structural information, we have obtained the MCC value of 0.457, with
PSSMs, observed secondary structure and solvent accessibility information assigned by DSSP as input. A web server implementing
the prediction method is available at the following URL: .
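For readers unfamiliar with the Matthews correlation coefficient reported in this abstract, the sketch below computes it from the four cells of a binary confusion matrix using the standard definition. The function name `mcc` and the convention of returning 0.0 when the denominator is zero are illustrative assumptions, not part of the cited work.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    Returns 0.0 when the denominator is zero (an assumed convention).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0
```

The value ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction right), and it stays informative when binding and non-binding residues are heavily imbalanced, which is why it is often preferred over plain accuracy for site prediction.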
11.
12.
Background
Protein subcellular localization is an important determinant of protein function and hence, reliable methods for prediction of localization are needed. A number of prediction algorithms have been developed based on amino acid compositions or on the N-terminal characteristics (signal peptides) of proteins. However, such approaches lead to a loss of contextual information. Moreover, where information about the physicochemical properties of amino acids has been used, the methods employed to exploit that information are less than optimal and could use the information more effectively.
13.
Zhen Chen Yuan Zhou Jiangning Song Ziding Zhang 《Biochimica et Biophysica Acta - Proteins and Proteomics》2013,1834(8):1461-1467
As one of the most common post-translational modifications, ubiquitination regulates the quantity and function of a variety of proteins. Experimental and clinical investigations have also suggested the crucial roles of ubiquitination in several human diseases. The complicated sequence context of human ubiquitination sites revealed by proteomic studies highlights the need to develop effective computational strategies to predict human ubiquitination sites. Here we report the establishment of a novel human-specific ubiquitination site predictor through the integration of multiple complementary classifiers. Firstly, a Support Vector Machine (SVM) classifier was constructed based on the composition of k-spaced amino acid pairs (CKSAAP) encoding, which has been utilized in our previous yeast ubiquitination site predictor. To further exploit the pattern and properties of the ubiquitination sites and their flanking residues, three additional SVM classifiers were constructed using the binary amino acid encoding, the AAindex physicochemical property encoding and the protein aggregation propensity encoding, respectively. Through an integration that relied on logistic regression, the resulting predictor, termed hCKSAAP_UbSite, achieved an area under the ROC curve (AUC) of 0.770 in a 5-fold cross-validation test on a class-balanced training dataset. When tested on a class-balanced independent testing dataset that contains 3419 ubiquitination sites, hCKSAAP_UbSite also achieved a robust performance with an AUC of 0.757. Specifically, it has consistently performed better than the predictor using the CKSAAP encoding alone and two other publicly available predictors which are not human-specific. Given its promising performance in our large-scale datasets, hCKSAAP_UbSite has been made publicly available at our server (http://protein.cau.edu.cn/cksaap_ubsite/).
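The CKSAAP encoding mentioned in this abstract can be sketched as follows. This is a minimal illustration of the general idea (counting residue pairs separated by k intervening positions, for several values of k), not the authors' implementation: the function name `cksaap`, the default `k_max=2`, the normalization by the number of pairs, and the handling of non-standard characters are all assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Map each of the 400 ordered residue pairs to a feature index.
PAIR_INDEX = {a + b: i
              for i, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))}


def cksaap(fragment, k_max=2):
    """Composition of k-spaced amino acid pairs for one sequence fragment.

    For each k in 0..k_max, counts pairs (fragment[i], fragment[i + k + 1])
    and divides by the number of such pairs, giving 400 features per k.
    Pairs containing a non-standard character are skipped (assumption).
    """
    fragment = fragment.upper()
    features = []
    for k in range(k_max + 1):
        counts = [0.0] * len(PAIR_INDEX)
        total = len(fragment) - k - 1  # number of k-spaced pairs
        for i in range(max(total, 0)):
            idx = PAIR_INDEX.get(fragment[i] + fragment[i + k + 1])
            if idx is not None:
                counts[idx] += 1.0
        if total > 0:
            counts = [c / total for c in counts]
        features.extend(counts)
    return features
```

With these assumptions a fragment yields 400 × (k_max + 1) features regardless of its length, which makes the vector directly usable as fixed-size input to a classifier such as an SVM.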
14.
Davis FP 《Molecular bioSystems》2011,7(2):545-557
Small molecules that modulate protein-protein interactions are of great interest for chemical biology and therapeutics. Here I present a structure-based approach to predict 'bi-functional' sites able to bind both small molecule ligands and proteins, in proteins of unknown structure. First, I develop a homology-based annotation method that transfers binding sites of known three-dimensional structure onto protein sequences, predicting residues in ligand and protein binding sites with estimated true positive rates of 98% and 88%, respectively, at 1% false positive rates. Applying this method to the human proteome predicts 8463 proteins with bi-functional residues and correctly recovers the targets of known interaction modulators. Proteins with significantly (p < 0.01) more bi-functional residues than expected were found to be enriched in regulatory and depleted in metabolism functions. Finally, I demonstrate the utility of the method by describing examples of predicted overlap and evidence of their biological and therapeutic relevance. The results suggest that combining the structures of known binding sites with established fold detection algorithms can predict regions of protein-protein interfaces that are amenable to small molecule modulation. Open-source software and the results for several complete proteomes are available at http://pibase.janelia.org/homolobind.
15.
Improved localization of intracellular sites of phosphatases using cerium and cell permeabilization (total citations: 8; self-citations: 0; citations by others: 8)
J M Robinson 《The journal of histochemistry and cytochemistry》1985,33(8):749-754
A simple permeabilization method has been developed that allows for intracellular localization of acid phosphatase in neutrophils and several types of tissue culture cells with cerium. This permeabilization procedure also facilitates intracellular alkaline phosphatase localization in neutrophils without the loss of cell surface reaction in this cell type. Only the cell surface reaction was detected in the absence of permeabilization. Glutaraldehyde-fixed cells were permeabilized with detergent during the cytochemical reaction. Triton X-100 at 0.0001-0.0002% gave the best results for the enzymes and cell types tested.
16.
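Protein structure prediction is one of the most important problems in modern computational biology, and protein secondary structure prediction is the foundation for predicting higher-order protein structure. Many methods for predicting protein secondary structure are currently available, among which SVM-based methods have achieved relatively high prediction accuracy. This article focuses on the steps involved in applying SVM to protein secondary structure prediction, and on the points that require attention when comparing it with other methods, in order to provide reference and inspiration for further research.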
17.
Background
Ubiquitination, which is also called “lysine ubiquitination”, occurs when a ubiquitin is attached to lysine (K) residues in target proteins. As one of the most important post-translational modifications (PTMs), it plays a significant role not only in protein degradation but also in other cellular functions. Thus, systematic analysis of the ubiquitination proteome is an appealing and challenging research topic. The existing methods for identifying protein ubiquitination sites can be divided into two kinds: mass spectrometry and computational methods. Mass spectrometry-based experimental methods can discover ubiquitination sites in eukaryotes, but are time-consuming and expensive. Therefore, it is a priority to develop computational approaches that can effectively and accurately identify protein ubiquitination sites.
Results
The existing computational methods usually require feature engineering, which may lead to redundancy and biased representations. Deep learning, by contrast, is able to extract underlying characteristics from large-scale training data via multiple-layer networks and non-linear mapping operations. In this paper, we proposed a deep architecture with multiple modalities to identify ubiquitination sites. First, according to prior knowledge and biological knowledge, we encoded protein sequence fragments around candidate ubiquitination sites into three modalities, namely raw protein sequence fragments, physico-chemical properties and sequence profiles, and designed different deep network layers to extract the hidden representations from them. Then, the generative deep representations corresponding to the three modalities were merged to build the final model. We evaluated our algorithm on PLMD, the largest available protein ubiquitination site database, and achieved 66.4% specificity, 66.7% sensitivity, 66.43% accuracy, and an MCC value of 0.221. A number of comparative experiments also indicated that our multimodal deep architecture outperformed several popular protein ubiquitination site prediction tools.
Conclusion
The results of comparative experiments validated the effectiveness of our deep network and also showed that our method outperformed several popular protein ubiquitination site prediction tools. The source code of our proposed method is available at https://github.com/jiagenlee/deepUbiquitylation.
18.
19.
The currently available data on protein sequences far exceed the experimental capacity to annotate their function, so annotation in silico, i.e. using computational methods, becomes increasingly important. This annotation is inevitably a prediction, but it can be an important starting point for further experimental studies. Here we present a method for prediction of protein functional sites, SDPsite, based on the identification of protein specificity determinants. Taking as input a protein sequence alignment and a phylogenetic tree, the algorithm predicts conserved positions and specificity determinants, maps them onto the protein's 3D structure, and searches for clusters of the predicted positions. Comparison of the obtained predictions with experimental data and with data on the performance of several other methods for prediction of functional sites reveals that SDPsite agrees well with experiment and outperforms most of the previously available methods. SDPsite is publicly available at http://bioinf.fbb.msu.ru/SDPsite.