Similar literature
 20 similar articles found; search took 859 ms
1.
Tyrosine sulfation is a ubiquitous posttranslational modification that regulates extracellular protein-protein interactions, modulates intracellular protein transport, and affects proteolytic processing of proteins. However, identifying tyrosine sulfation sites remains a challenge due to the lability of sulfation sequences. In this study, we developed a method called PredSulSite that incorporates protein secondary structure, physicochemical properties of amino acids, and residue sequence order information, using a support vector machine to predict sulfotyrosine sites. Three types of encoding algorithms (secondary structure, grouped weight, and autocorrelation function) were applied to mine features from tyrosine sulfation proteins. The prediction model with multiple features achieved an accuracy of 92.89% in 10-fold cross-validation. Feature analysis showed that the coil structure, acidic amino acids, and residue interactions around the tyrosine sulfation sites all contributed to the sulfation site determination. The detailed feature analysis in this work can help us to understand the sulfation mechanism and provide guidance for the related experimental validation. PredSulSite is available as a community resource at http://www.bioinfo.ncu.edu.cn/inquiries_PredSulSite.aspx.

2.
Zheng LL  Niu S  Hao P  Feng K  Cai YD  Li Y 《PloS one》2011,6(12):e28221
Pyrrolidone carboxylic acid (PCA) is formed during a common post-translational modification (PTM) of extracellular and multi-pass membrane proteins. In this study, we developed a new predictor to predict the modification sites of PCA based on maximum relevance minimum redundancy (mRMR) and incremental feature selection (IFS). We incorporated 727 features that belonged to 7 kinds of protein properties to predict the modification sites, including sequence conservation, residual disorder, amino acid factor, secondary structure and solvent accessibility, gain/loss of amino acid during evolution, propensity of amino acid to be conserved at protein-protein interface and protein surface, and deviation of side chain carbon atom number. Among these 727 features, 244 features were selected by mRMR and IFS as the optimized features for the prediction, with which the prediction model achieved a maximum MCC of 0.7812. Feature analysis showed that all feature types contributed to the modification process. Further site-specific feature analysis showed that the features derived from PCA's surrounding sites contributed more to the determination of PCA sites than other sites. The detailed feature analysis in this paper might provide important clues for understanding the mechanism of the PCA formation and guide relevant experimental validations.

3.
Protein–DNA interactions play important roles in many biological processes. To understand the molecular mechanisms of protein–DNA interaction, it is necessary to identify the DNA-binding sites in DNA-binding proteins. In the last decade, computational approaches have been developed to predict protein–DNA-binding sites based solely on protein sequences. In this study, we developed a novel predictor based on the support vector machine algorithm coupled with the maximum relevance minimum redundancy method followed by incremental feature selection. We incorporated not only features of physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure, and solvent accessibility, but also five three-dimensional (3D) structural features calculated from PDB data to predict the protein–DNA interaction sites. Feature analysis showed that 3D structural features indeed contributed to the prediction of DNA-binding sites, and it was demonstrated that the prediction performance was better with 3D structural features than without them. It was also shown via analysis of features from each site that the features of the DNA-binding site itself contribute the most to the prediction. Our prediction method may become a useful tool for identifying DNA-binding sites, and the feature analysis described in this paper may provide useful insights for in-depth investigations into the mechanisms of protein–DNA interaction.

4.
5.
Li BQ  Hu LL  Niu S  Cai YD  Chou KC 《Journal of Proteomics》2012,75(5):1654-1665
S-nitrosylation (SNO) is one of the most important and universal post-translational modifications (PTMs), regulating various cellular functions and signaling events. Identification of the exact S-nitrosylation sites in proteins may facilitate the understanding of the molecular mechanisms and biological function of S-nitrosylation. Unfortunately, traditional experimental approaches used for detecting S-nitrosylation sites are often laborious and time-consuming. Computational methods could overcome this drawback. In this work, we developed a novel predictor based on the nearest neighbor algorithm (NNA) with the maximum relevance minimum redundancy (mRMR) method followed by incremental feature selection (IFS). The features of physicochemical/biochemical properties, sequence conservation, residual disorder, amino acid occurrence frequency, secondary structure and solvent accessibility were utilized to represent the peptides concerned. Feature analysis showed that all features except residual disorder affected identification of the S-nitrosylation sites. It was also shown via the site-specific feature analysis that the features of sites away from the central cysteine might contribute to the S-nitrosylation site determination in a subtle manner. It is anticipated that our prediction method may become a useful tool for identifying protein S-nitrosylation sites and that the feature analysis described in this paper may provide useful insights for in-depth investigation into the mechanism of S-nitrosylation.

6.
BQ Li  KY Feng  L Chen  T Huang  YD Cai 《PloS one》2012,7(8):e43927
Prediction of protein-protein interaction (PPI) sites is one of the most challenging problems in computational biology. Although great progress has been made by employing various machine learning approaches with numerous characteristic features, the problem is still far from being solved. In this study, we developed a novel predictor based on Random Forest (RF) algorithm with the Minimum Redundancy Maximal Relevance (mRMR) method followed by incremental feature selection (IFS). We incorporated features of physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure and solvent accessibility. We also included five 3D structural features to predict protein-protein interaction sites and achieved an overall accuracy of 0.672997 and MCC of 0.347977. Feature analysis showed that 3D structural features such as Depth Index (DPX) and surface curvature (SC) contributed most to the prediction of protein-protein interaction sites. It was also shown via site-specific feature analysis that the features of individual residues from PPI sites contribute most to the determination of protein-protein interaction sites. It is anticipated that our prediction method will become a useful tool for identifying PPI sites, and that the feature analysis described in this paper will provide useful insights into the mechanisms of interaction.

7.
Cai Y  Huang T  Hu L  Shi X  Xie L  Li Y 《Amino acids》2012,42(4):1387-1395
Ubiquitination, one of the most important post-translational modifications of proteins, occurs when ubiquitin (a small 76-amino acid protein) is attached to lysine on a target protein. It often commits the labeled protein to degradation and plays important roles in regulating many cellular processes implicated in a variety of diseases. Since ubiquitination is rapid and reversible, it is time-consuming and labor-intensive to identify ubiquitination sites using conventional experimental approaches. To efficiently discover lysine-ubiquitination sites, a sequence-based predictor of ubiquitination sites was developed based on the nearest neighbor algorithm. We used the maximum relevance and minimum redundancy principle to identify the key features and the incremental feature selection procedure to optimize the prediction engine. PSSM conservation scores, amino acid factors and disorder scores of the surrounding sequence formed the optimized 456 features. The Matthews correlation coefficient (MCC) of our ubiquitination site predictor reached 0.142 in a jackknife cross-validation test on a large benchmark dataset. In an independent test, the MCC of our method was 0.139, higher than that of the existing ubiquitination site predictors UbiPred and UbPred. The MCCs of UbiPred and UbPred on the same test set were 0.135 and 0.117, respectively. Our analysis shows that the conservation of amino acids at and around lysine plays an important role in ubiquitination site prediction. Moreover, disorder and ubiquitination are strongly related. These findings might provide useful insights for studying the mechanisms of ubiquitination and modulating the ubiquitination pathway, potentially leading to therapeutic strategies in the future.
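The combination described in this abstract (a ranked feature list, incremental feature selection, a nearest neighbor classifier, jackknife testing, and MCC as the score) can be illustrated with a minimal sketch. It is not the authors' implementation: the ranking `ranked` is assumed to come from a separate mRMR step, squared Euclidean distance is an assumed similarity measure, and all function names are hypothetical.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def jackknife_1nn(X, y, cols):
    """Leave-one-out 1-nearest-neighbor predictions using only columns `cols`."""
    preds = []
    for i, xi in enumerate(X):
        best_dist, best_label = None, None
        for j, xj in enumerate(X):
            if i == j:
                continue  # the held-out sample never votes for itself
            # Assumption: squared Euclidean distance as the similarity measure.
            dist = sum((xi[c] - xj[c]) ** 2 for c in cols)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, y[j]
        preds.append(best_label)
    return preds


def incremental_feature_selection(X, y, ranked):
    """Grow the feature set along `ranked`; keep the prefix with the best MCC."""
    best_k, best_score = 0, -2.0  # MCC lies in [-1, 1]
    for k in range(1, len(ranked) + 1):
        preds = jackknife_1nn(X, y, ranked[:k])
        tp = sum(p == 1 and t == 1 for p, t in zip(preds, y))
        tn = sum(p == 0 and t == 0 for p, t in zip(preds, y))
        fp = sum(p == 1 and t == 0 for p, t in zip(preds, y))
        fn = sum(p == 0 and t == 1 for p, t in zip(preds, y))
        score = mcc(tp, tn, fp, fn)
        if score > best_score:
            best_k, best_score = k, score
    return ranked[:best_k], best_score
```

Each candidate feature set is a prefix of the ranked list, so only as many models are evaluated as there are features, rather than one per possible subset.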

8.
Carboxy-terminal α-amidation is a widespread post-translational modification of proteins in vertebrates and invertebrates. The α-amide group is required for full biological activity, since it may render a peptide more hydrophobic and thus better able to bind to other proteins, preventing ionization of the C-terminus. However, C-terminal amidation is very difficult to detect because experimental methods are often labor-intensive, time-consuming and expensive. In silico methods may therefore complement them owing to their high efficiency. In this study, a computational method was developed to predict protein amidation sites by incorporating the maximum relevance minimum redundancy method and the incremental feature selection method based on the nearest neighbor algorithm. From a total of 735 features, 41 optimal features were selected and utilized to construct the final predictor. As a result, the predictor achieved an overall Matthews correlation coefficient of 0.8308. Feature analysis showed that PSSM conservation scores and amino acid factors played the most important roles in the α-amidation site prediction. Site-specific feature analyses showed that features derived from the amidation site itself and adjacent sites were most significant. The method presented here could be used as an efficient tool to theoretically predict amidated peptides. In addition, the selected features from our study could shed some light on the mechanisms of the amidation modification, providing guidelines for experimental validation.

9.
Tyrosylprotein sulfotransferase (TPST) catalyzes the sulfation of proteins at tyrosine residues. We have analyzed the substrate specificity of TPST from bovine adrenal medulla with a novel assay, using synthetic peptides as substrates. The peptides were modeled after the known, or putative, tyrosine sulfation sites of the cholecystokinin precursor, chromogranin B (secretogranin I) and vitronectin, as well as the tyrosine phosphorylation sites of alpha-tubulin and pp60src. Varying the sequence of these peptides, we found that (i) the apparent Km of peptides with multiple tyrosine sulfation sites decreased exponentially with the number of sites; (ii) acidic amino acids were the major determinant for tyrosine sulfation, acidic amino acids adjacent to the tyrosine being more important than distant ones; (iii) a carboxyl-terminally located tyrosine residue may be sulfated. Moreover, TPST catalyzed the sulfation of a peptide corresponding to the tyrosine autophosphorylation site of pp60v-src (Tyr-416) but not of a peptide corresponding to the non-autophosphorylation site of pp60c-src (Tyr-527). These results experimentally define structural determinants for the substrate specificity of TPST and show that this enzyme and certain autophosphorylating tyrosine kinases have overlapping substrate specificities in vitro.

10.
Our previous results showed that sulfated tyrosines of thyroglobulin (Tg), the molecular support of thyroid hormonosynthesis, are involved in the hormonogenic process. Moreover, the consensus sequence required for tyrosine sulfation is present in most of the hormonogenic sites. These observations suggest that tyrosine sulfation might play a critical role in the hormonogenic process. In this paper we studied the putative sulfation of tyrosine 5 contained in the preferential hormonogenic site. Porcine thyrocytes were cultured with thyrotropin but without iodide to preserve the sulfation state of tyrosine 5 and then incubated or not with [35S]sulfate. Secreted Tg was purified and submitted to peptide sequence analysis, which confirmed the known peptide sequence of the NH2 extremity of Tg: NIFEYQV. The treatment of [35S]sulfate-labeled Tg by leucine aminopeptidase, which sequentially digested its amino-terminal extremity, released the same amino acids, and further analysis by thin layer chromatography showed that the tyrosine was sulfated. We concluded that tyrosine 5 is sulfated, but the role of the sulfate group in the hormonogenic process remains to be elucidated.

11.
Hu LL  Niu S  Huang T  Wang K  Shi XH  Cai YD 《PloS one》2010,5(12):e15917

Background

Hydroxylation is an important post-translational modification and closely related to various diseases. Besides the biotechnology experiments, in silico prediction methods are alternative ways to identify the potential hydroxylation sites.

Methodology/Principal Findings

In this study, we developed a novel sequence-based method for identifying the two main types of hydroxylation sites: hydroxyproline and hydroxylysine. First, feature selection was performed on three kinds of features: amino acid indices (AAindex), which include various physicochemical and biochemical properties of amino acids; Position-Specific Scoring Matrices (PSSM), which represent evolutionary information of amino acids; and structural disorder of amino acids, in a sliding window with a length of 13 amino acids. The prediction model was then built using the incremental feature selection method. As a result, the prediction accuracies were 76.0% and 82.1%, evaluated by jackknife cross-validation on the hydroxyproline dataset and hydroxylysine dataset, respectively. Feature analysis suggested that the physicochemical and biochemical properties and the evolutionary information of amino acids contribute substantially to the identification of protein hydroxylation sites, while structural disorder had little relation to protein hydroxylation. It was also found that the amino acid adjacent to the hydroxylation site tends to exert more influence than other sites on hydroxylation determination.

Conclusions/Significance

These findings may provide useful insights for exploiting the mechanisms of hydroxylation.

12.
Phosphorylation is one of the most important post-translational modifications, and the identification of protein phosphorylation sites is particularly important for studying disease diagnosis. However, experimental detection of phosphorylation sites is labor intensive. It would be beneficial if computational methods were available to provide an extra reference for the phosphorylation sites. Here we developed a novel sequence-based method for serine, threonine, and tyrosine phosphorylation site prediction. The Nearest Neighbor algorithm was employed as the prediction engine. The peptides around the phosphorylation sites, with a fixed length of thirteen amino acid residues, were extracted via a sliding window along the protein chains concerned. Each such peptide was coded into a vector with 6,072 features, derived from the Amino Acid Index (AAIndex) database, for the classification/detection. Incremental Feature Selection, a feature selection algorithm based on the Maximum Relevance Minimum Redundancy (mRMR) method, was used to select a compact feature set for a further improvement of the classification performance. Three predictors were established for identifying the three types of phosphorylation sites, achieving overall accuracies of 66.64%, 66.11% and 66.69%, respectively. These rates were obtained by rigorous jackknife cross-validation tests.
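The sliding-window extraction described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `extract_windows` is hypothetical, and padding the chain ends with "X" is an assumed convention, since the abstract does not say how residues near the termini were handled.

```python
def extract_windows(sequence, targets="STY", window=13, pad="X"):
    """Return (1-based position, residue, peptide) for each target residue.

    Each peptide is `window` residues long and centered on the target.
    Assumption: positions beyond either terminus are filled with `pad`.
    """
    if window % 2 == 0:
        raise ValueError("window length must be odd so the site is centered")
    half = window // 2
    padded = pad * half + sequence + pad * half
    peptides = []
    for pos, residue in enumerate(sequence):
        if residue in targets:
            # In `padded`, the target sits at index pos + half,
            # so the window centered on it starts at index pos.
            peptides.append((pos + 1, residue, padded[pos:pos + window]))
    return peptides
```

Every extracted peptide would then be encoded as a numeric feature vector before classification; that encoding step is outside the scope of this sketch.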

13.
Hu LL  Wan SB  Niu S  Shi XH  Li HP  Cai YD  Chou KC 《Biochimie》2011,93(3):489-496
Palmitoylation is a universal and important lipid modification, involved in a series of basic cellular processes such as membrane trafficking, protein stability and protein aggregation. With the avalanche of new protein sequences generated in the post-genomic era, it is highly desirable to develop computational methods for rapidly and effectively identifying the potential palmitoylation sites of uncharacterized proteins, so as to provide timely and useful information for revealing the mechanism of protein palmitoylation. By using the Incremental Feature Selection approach based on amino acid factors, conservation, disorder features, and specific features of the palmitoylation site, a new predictor named IFS-Palm was developed in this regard. The overall success rate achieved by the jackknife test on a newly constructed benchmark dataset was 90.65%. It was shown via an in-depth analysis that palmitoylation was intimately correlated with the features of the upstream residue directly adjacent to the cysteine site as well as the conservation of the amino acid cysteine. Meanwhile, the protein disorder region might also play an important role in the post-translational modification. These findings may provide useful insights for revealing the mechanisms of palmitoylation.

14.
15.
Prediction of tyrosine sulfation sites in animal viruses   Total citations: 1, self-citations: 0, citations by others: 1
Post-translational modification of proteins by tyrosine sulfation enhances the affinity of extracellular ligand-receptor interactions important in the immune response and other biological processes in animals. For example, sulfated tyrosines in polyomavirus and varicella-zoster virus may help modulate host cell recognition and facilitate viral attachment and entry. Using a Position-Specific-Scoring-Matrix with an accuracy of 96.43%, we analyzed the possibility of tyrosine sulfation in all 1517 animal viruses available in the Swiss-Prot database. From a total of 97,729 tyrosines, we predicted 5091 sulfated tyrosine sites from 1024 viruses. Our site predictions in hemagglutinin of influenza A, VP4 of rotavirus, and US28 of cytomegalovirus strongly suggest an important link between tyrosine sulfation and viral disease mechanisms. In each of these three viral proteins, we observed highly conserved amino acid sequences surrounding predicted sulfated tyrosine sites. Tyrosine sulfation appears to be much more common in animal viruses than is currently recognized.
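As a rough illustration of how a position-specific scoring matrix can be used to scan tyrosines, the sketch below sums per-position scores over a window centered on each tyrosine and reports sites at or above a threshold. The abstract does not give the matrix, window width, or cutoff, so the PSSM layout (a list of per-position dictionaries), the function names, and the choice to skip incomplete windows near the termini are all assumptions for illustration only.

```python
def score_window(window, pssm, default=0.0):
    """Sum the per-position scores of `window` under `pssm`.

    `pssm` is a list with one dict per window position, mapping a residue
    letter to its score; residues missing from a dict score `default`.
    """
    return sum(pssm[i].get(res, default) for i, res in enumerate(window))


def predict_sites(sequence, pssm, threshold, target="Y"):
    """Return (1-based position, score) for target residues scoring >= threshold."""
    half = len(pssm) // 2
    hits = []
    for pos, res in enumerate(sequence):
        if res != target:
            continue
        start, end = pos - half, pos + half + 1
        if start < 0 or end > len(sequence):
            continue  # assumption: windows cut off by a terminus are skipped
        score = score_window(sequence[start:end], pssm)
        if score >= threshold:
            hits.append((pos + 1, score))
    return hits
```

A usage example with a made-up three-position matrix that rewards acidic neighbors: `predict_sites("ADYEKYK", [{"D": 1.0, "E": 1.0}, {"Y": 0.0}, {"D": 1.0, "E": 1.0}], threshold=1.5)` scores the first tyrosine from the window `DYE` and the second from `KYK`.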

16.
We present a new method for predicting protein–ligand-binding sites based on protein three-dimensional structure and amino acid conservation. This method involves calculation of the van der Waals interaction energy between a protein and many probes placed on the protein surface and subsequent clustering of the probes with low interaction energies to identify the most energetically favorable locus. In addition, it uses amino acid conservation among homologous proteins. Ligand-binding sites were predicted by combining the interaction energy and the amino acid conservation score. The performance of our prediction method was evaluated using a non-redundant dataset of 348 ligand-bound and ligand-unbound protein structure pairs, constructed by filtering entries in a ligand-binding site structure database, LigASite. Ligand-bound structure prediction (bound prediction) indicated that 74.0% of predicted ligand-binding sites overlapped with real ligand-binding sites by over 25% of their volume. Ligand-unbound structure prediction (unbound prediction) indicated that 73.9% of predicted ligand-binding residues overlapped with real ligand-binding residues. The amino acid conservation score improved the average prediction accuracy by 17.0 and 17.6 points for the bound and unbound predictions, respectively. These results demonstrate the effectiveness of the combined use of the interaction energy and amino acid conservation in the ligand-binding site prediction.

17.
Protein tyrosine sulfation is an important post-translational modification of proteins that go through the secretory pathway. No clear-cut acceptor motif can be defined that allows the prediction of tyrosine sulfation sites in polypeptide chains. The Sulfinator is a software tool that can be used to predict tyrosine sulfation sites in protein sequences with an overall accuracy of 98%. Four different Hidden Markov Models were constructed, each of them specialized to recognize sulfated tyrosine residues depending on their location within the sequence: near the N-terminus, near the C-terminus, in the center of a window with a size of at least 25 amino acids, as well as in windows containing several tyrosine residues. AVAILABILITY: The Sulfinator is accessible at (http://www.expasy.org/tools/sulfinator/). Supplementary information: Sulfinator documentation is accessible at (http://www.expasy.org/tools/sulfinator/sulfinator-doc.html).

18.
Microfibril-associated glycoprotein-1 (MAGP-1) is a small molecular weight protein associated with extracellular matrix microfibrils. Biochemical studies have shown that MAGP-1 undergoes several posttranslational modifications that may influence its associations with other microfibrillar components. To identify the sites in the molecule where posttranslational modifications occur, we expressed MAGP-1 constructs containing various point mutations as well as front and back half truncations in CHO cells. Characterization of transiently expressed protein showed that MAGP-1 undergoes O-linked glycosylation and tyrosine sulfation at sites in its amino-terminal half. This region of the protein also served as a major amine acceptor site for transglutaminase and mediated self-assembly into high molecular weight multimers through a glutamine-rich sequence. Fine mapping of the modification sites through mutational analysis demonstrated that Gln20 is a major amine acceptor site for the transglutaminase reaction and confirmed that a canonical tyrosine sulfation consensus sequence is the site of MAGP-1 sulfation. Our results also show that O-glycosylation occurs at more than one site in the molecule.

19.
J R Bundgaard  J Vuust    J F Rehfeld 《The EMBO journal》1995,14(13):3073-3079
Tyrosine O-sulfation is a common post-translational modification of secretory and membrane proteins. The biological function of sulfation is known in only a few proteins, where it appears to enhance protein-protein interactions. Based on known sequences around sulfated tyrosines, a consensus sequence for prediction of target tyrosines has been proposed. However, some proteins are tyrosine sulfated at sites that deviate from the proposed consensus. Among these is progastrin. It is possible that the deviation explains the incomplete sulfation characteristic for bioactive gastrin peptides. In order to test this hypothesis, we have performed site-directed mutagenesis of the gastrin gene followed by heterologous expression in an endocrine cell line. The results show that substitution of the alanyl residue immediately N-terminal to the sulfated tyrosine with an acidic amino acid promotes the sulfation of gastrin peptides. Hence, the study supports the proposed consensus sequence for tyrosine sulfation. Importantly, however, the results also reveal that complete sulfation increases the endoproteolytic maturation of progastrin. Thus, our study suggests an additional function for tyrosine sulfation of possible general significance.

20.
Proteinases play critical roles in both intra- and extracellular processes by binding and cleaving their protein substrates. The cleavage can either be non-specific, as part of degradation during protein catabolism, or highly specific, as part of proteolytic cascades and signal transduction events. Identification of these targets is extremely challenging. Current computational approaches for predicting cleavage sites are very limited since they mainly represent the amino acid sequences as patterns or frequency matrices. In this work, we developed a novel predictor based on the Random Forest (RF) algorithm using the maximum relevance minimum redundancy (mRMR) method followed by incremental feature selection (IFS). The features of physicochemical/biochemical properties, sequence conservation, residual disorder, amino acid occurrence frequency, secondary structure and solvent accessibility were utilized to represent the peptides concerned. Here, we compared our method with existing tools for predicting possible cleavage sites in candidate substrates. It is shown that our method makes much more reliable predictions in terms of the overall prediction accuracy. In addition, this predictor allows the use of a wide range of proteinases.
