Similar literature
 20 similar documents found
1.
2.
The AutoMotif Server identifies post-translational modification (PTM) sites in proteins based only on local sequence information. The local sequence preferences of short segments around PTM residues are described here as linear functional motifs (LFMs). Sequence models for all types of PTMs (phosphorylation by various protein kinases, sulfation, acetylation, methylation, amidation, etc.) are trained by a support vector machine on short sequence fragments of proteins in the current release of the Swiss-Prot database. The accuracy of the identification is estimated using the standard leave-one-out procedure. The sensitivities for all types of short LFMs are in the range of 70%. AVAILABILITY: The AutoMotif Server is available free for academic use at http://automotif.bioinfo.pl/

3.

Background  

We present here the recent update of the AMS algorithm for identification of post-translational modification (PTM) sites in proteins based only on sequence information, using the artificial neural network (ANN) method. The query protein sequence is dissected into overlapping short sequence segments. Ten different physicochemical features describe each amino acid; therefore, a nine-residue segment is represented as a point in a 90-dimensional space. A database of sequence segments with experimentally confirmed post-translational modification sites is used for training a set of ANNs.
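
The segment representation described in this abstract can be sketched as follows. This is a minimal illustration, not the AMS implementation: the abstract does not name the ten physicochemical features, so `placeholder_features` returns made-up numbers, and all function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_FEATURES = 10   # ten features per residue, as stated in the abstract
WINDOW = 9        # nine-residue segments, as stated in the abstract


def placeholder_features(residue):
    """Return ten made-up numbers for a residue.

    ASSUMPTION: these values only stand in for the real physicochemical
    features, which the abstract does not list.
    """
    idx = AMINO_ACIDS.index(residue)  # raises ValueError for unknown residues
    return [((idx + 1) * (k + 1) % 7) / 7.0 for k in range(N_FEATURES)]


def segments(sequence, window=WINDOW):
    """Dissect a sequence into overlapping segments of fixed length."""
    return [sequence[i:i + window] for i in range(len(sequence) - window + 1)]


def encode(segment):
    """Concatenate per-residue features: 9 residues x 10 features = 90 values."""
    vector = []
    for residue in segment:
        vector.extend(placeholder_features(residue))
    return vector


if __name__ == "__main__":
    query = "ACDEFGHIKLMN"
    for seg in segments(query):
        print(seg, len(encode(seg)))
```

Each encoded vector would then be passed to a trained classifier; the network itself is outside the scope of this sketch.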

4.
We present here the recent update of the AutoMotif Server (AMS 2.0), which predicts post-translational modification sites in protein sequences. The support vector machine (SVM) algorithm was trained on data gathered in 2007 from various sets of proteins containing experimentally verified chemical modifications of proteins. Short sequence segments around a modification site were dissected from a parent protein and represented in the training set as binary or profile vectors. The updated efficiency of the SVM classification for each type of modification and the predictive power of both representations were estimated using leave-one-out tests for the model of general phosphorylation and for modifications catalyzed by several specific protein kinases. The accuracy of the method was improved in comparison to the previous version of the service (Plewczynski et al., “AutoMotif server: prediction of single residue post-translational modifications in proteins”, Bioinformatics 21: 2525–7, 2005). The precision of the updated version reached over 90% for selected types of phosphorylation and was optimized at the cost of a lower recall value of the classification model. The AutoMotif Server version 2007 is freely available at . Additionally, the reference dataset for optimization of prediction of phosphorylation sites, collected from UniProtKB, was also provided and can be accessed at .
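
To make the binary representation and the precision/recall trade-off concrete, here is a minimal sketch. It assumes that "binary vector" means a one-hot encoding with 20 positions per residue, which the abstract does not spell out; the function names and the example counts are hypothetical, and the profile representation is not covered.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot_segment(segment):
    """Encode a segment as a binary vector, 20 positions per residue.

    ASSUMPTION: one-hot encoding is used here as one common way to build
    binary vectors; the server's exact encoding is not given in the text.
    """
    vector = [0] * (len(AMINO_ACIDS) * len(segment))
    for position, residue in enumerate(segment):
        vector[position * len(AMINO_ACIDS) + AMINO_ACIDS.index(residue)] = 1
    return vector


def precision_recall(true_pos, false_pos, false_neg):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


if __name__ == "__main__":
    print(sum(one_hot_segment("RRASVAGKL")))  # one bit set per residue
    # Made-up counts showing high precision paired with lower recall.
    print(precision_recall(true_pos=90, false_pos=10, false_neg=60))
```

A stricter decision threshold reduces false positives (raising precision) while missing more true sites (lowering recall), which is the trade-off the abstract describes.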

5.
Various post‐translational modifications (PTMs) fine‐tune the functions of almost all eukaryotic proteins, and co‐regulation of different types of PTMs has been shown within and between a number of proteins. Aiming at a more global view of the interplay between PTM types, we collected modifications for 13 frequent PTM types in 8 eukaryotes, compared their speed of evolution and developed a method for measuring PTM co‐evolution within proteins based on the co‐occurrence of sites across eukaryotes. As many sites are still to be discovered, this is a considerable underestimate; yet, assuming that most co‐evolving PTMs are functionally associated, we found that PTM types are vastly interconnected, forming a global network that comprises in human alone >50 000 residues in about 6000 proteins. We predict substantial PTM type interplay in secreted and membrane‐associated proteins and in the context of particular protein domains and short linear motifs. The global network of co‐evolving PTM types implies a complex and intertwined post‐translational regulation landscape that is likely to regulate multiple functional states of many if not all eukaryotic proteins.

6.
Proteomic patterns as a potential diagnostic technology have been well established for several cancer conditions and other diseases. The use of machine learning techniques such as decision trees, neural networks, genetic algorithms, and other methods has been the basis for pattern determination. Cancer is known to involve signaling pathways that are regulated through PTM of proteins. These modifications are also detectable with high confidence using high-resolution MS. We generated data using a prOTOF mass spectrometer on two sets of patient samples: ovarian cancer and cutaneous T-cell lymphoma (CTCL), with matched normal samples for each disease. Using the knowledge of mass shifts caused by common modifications, we built models using peak pairs and compared this to a conventional technique using individual peaks. The results for each disease showed that a small number of peak pairs gave classification equal to or better than the conventional technique that used multiple individual peaks. This simple peak picking technique could be used to guide identification of important peak pairs involved in the disease process.
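
The peak-pair idea can be illustrated with a short sketch that looks for peaks separated by a known modification mass shift. It is not the authors' pipeline: the function name, the tolerance, the choice of three modifications and the example peak list are assumptions, and singly charged ions are assumed so that an m/z difference equals a mass difference.

```python
# Monoisotopic mass shifts in Da for three common modifications.
PTM_SHIFTS = {
    "phosphorylation": 79.96633,  # +HPO3
    "acetylation": 42.01057,      # +C2H2O
    "methylation": 14.01565,      # +CH2
}


def find_peak_pairs(peaks, shifts=PTM_SHIFTS, tol=0.01):
    """Return (low, high, ptm) for peaks whose spacing matches a PTM shift.

    ASSUMPTION: peaks are m/z values of singly charged ions, and `tol`
    (in Da) is an arbitrary illustrative tolerance.
    """
    ordered = sorted(peaks)
    pairs = []
    for i, low in enumerate(ordered):
        for high in ordered[i + 1:]:
            delta = high - low
            for name, shift in shifts.items():
                if abs(delta - shift) <= tol:
                    pairs.append((low, high, name))
    return pairs


if __name__ == "__main__":
    # Made-up peak list for illustration only.
    example = [1000.0, 1079.966, 1500.0, 1542.011]
    for pair in find_peak_pairs(example):
        print(pair)
```

The intensities of the two peaks in each pair could then serve as classifier features in place of individual peaks.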

7.
Post-translational modifications (PTMs) are crucial steps in protein synthesis and are important factors contributing to protein diversity. PTMs play important roles in the regulation of gene expression, protein stability and metabolism. Lysine residues in protein sequences have been found to be targeted by two types of PTMs, sumoylation and acetylation; however, each PTM has a different cellular role. As experimental approaches are often laborious and time-consuming, it is challenging to distinguish the two types of PTMs on lysine residues using computational methods. In this study, we developed a method to discriminate between sumoylated lysine residues and acetylated residues. The method incorporated several features: PSSM conservation scores, amino acid factors, secondary structures, solvent accessibilities and disorder scores. By using the mRMR (Maximum Relevance Minimum Redundancy) method and the IFS (Incremental Feature Selection) method, an optimal feature set was selected from all of the incorporated features, with which the classifier achieved 92.14% accuracy with an MCC value of 0.7322. Analysis of the optimal feature set revealed some differences between acetylation and sumoylation. The results from our study also supported the previous finding that there exist different consensus motifs for the two types of PTMs. The results could suggest possible dominant factors governing the acetylation and sumoylation of lysine residues, shedding some light on the modification dynamics and molecular mechanisms of the two types of PTMs, and provide guidelines for experimental validations.
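
The two metrics reported in this abstract, accuracy and the Matthews correlation coefficient (MCC), can be computed from a confusion matrix as sketched below. The function names and the example counts are illustrative assumptions and do not reproduce the study's data.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to +1.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero (a common convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Made-up, imbalanced counts: accuracy looks high while MCC is lower.
    print(accuracy(tp=20, tn=900, fp=30, fn=50))
    print(mcc(tp=20, tn=900, fp=30, fn=50))
```

MCC is often reported next to accuracy because it stays informative when one class is much larger than the other.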

8.
Post‐translational modifications (PTMs) of proteins are central in any kind of cellular signaling. Modern mass spectrometry technologies enable comprehensive identification and quantification of various PTMs. Given the increased numbers and types of mapped protein modifications, a database is necessary that simultaneously integrates and compares site‐specific information for different PTMs, especially in plants for which the available PTM data are poorly catalogued. Here, we present the Plant PTM Viewer (http://www.psb.ugent.be/PlantPTMViewer), an integrative PTM resource that comprises approximately 370 000 PTM sites for 19 types of protein modifications in plant proteins from five different species. The Plant PTM Viewer provides the user with a protein sequence overview in which the experimentally evidenced PTMs are highlighted together with an estimate of the confidence by which the modified peptides and, if possible, the actual modification sites were identified and with functional protein domains or active site residues. The PTM sequence search tool can query PTM combinations in specific protein sequences, whereas the PTM BLAST tool searches for modified protein sequences to detect conserved PTMs in homologous sequences. Taken together, these tools help to assume the role and potential interplay of PTMs in specific proteins or within a broader systems biology context. The Plant PTM Viewer is an open repository that allows the submission of mass spectrometry‐based PTM data to remain at pace with future PTM plant studies.

9.
10.
Post‐translational modifications (PTMs) represent an important regulatory layer influencing the structure and function of proteins. With broader availability of experimental information on the occurrences of different PTM types, the investigation of a potential “crosstalk” between different PTM types and of combinatorial effects has moved into the research focus. Hypothesizing that relevant interferences between different PTM types and sites may become apparent when investigating their mutual physical distances, we performed a systematic survey of pairwise homo‐ and heterotypic distances of seven frequent PTM types, considering their sequence and spatial distances in resolved protein structures. We found that actual PTM site distance distributions differ from random distributions, with most PTM type pairs exhibiting larger than expected distances, with the exception of homotypic phosphorylation site distances and distances between phosphorylation and ubiquitination sites, which were found to be closer than expected by chance. Random reference distributions considering canonical acceptor amino acid residues only were found to be shifted to larger distances compared to distances between any amino acid residue type, indicating an underlying tendency of PTM‐amenable residue types to be further apart than randomly expected. Distance distributions based on sequence separations were found largely consistent with their spatial counterparts, suggesting a primary role of sequence‐based pairwise PTM‐location encoding rather than folding‐mediated effects. Our analysis provides a systematic and comprehensive overview of the characteristics of pairwise PTM site distances on proteins and reveals that, predominantly, PTM sites tend to avoid close proximity, with the potential implication that an independent attachment or removal of PTMs remains possible. Proteins 2016; 85:78–92. © 2016 Wiley Periodicals, Inc.
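
The sequence-based part of such a survey can be sketched as grouping pairwise site separations by PTM type pair. This is a simplified illustration under assumed inputs (a list of `(position, ptm_type)` tuples for one protein); the function name is hypothetical, and the spatial distances and random reference distributions of the study are not covered.

```python
from itertools import combinations


def pairwise_sequence_distances(sites):
    """Group sequence separations of PTM sites by (type, type) pair.

    `sites` is a list of (position, ptm_type) tuples for one protein.
    Keys with two identical types are homotypic pairs; keys with two
    different types are heterotypic pairs.
    """
    distances = {}
    for (pos_a, type_a), (pos_b, type_b) in combinations(sites, 2):
        key = tuple(sorted((type_a, type_b)))
        distances.setdefault(key, []).append(abs(pos_a - pos_b))
    return distances


if __name__ == "__main__":
    # Made-up site annotations for illustration only.
    example = [(10, "phospho"), (15, "phospho"), (40, "ubiq")]
    for pair, values in pairwise_sequence_distances(example).items():
        print(pair, values)
```

Comparing these observed separations with separations between randomly chosen acceptor residues would give the kind of reference distribution the abstract mentions.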

11.
DNA sequence classification is the activity of determining whether or not an unlabeled sequence S belongs to an existing class C. This paper proposes two new techniques for DNA sequence classification. The first technique works by comparing the unlabeled sequence S with a group of active motifs discovered from the elements of C and by distinction with elements outside of C. The second technique generates and matches gapped fingerprints of S with elements of C. Experimental results obtained by running these algorithms on long and well conserved Alu sequences demonstrate the good performance of the presented methods compared with FASTA. When applied to less conserved and relatively short functional sites such as splice-junctions, a variation of the second technique combining fingerprinting with consensus sequence analysis gives better results than the current classifiers employing text compression and machine learning algorithms.

12.
Our algorithm predicts short linear functional motifs in proteins using only sequence information. Statistical models for short linear functional motifs in proteins are built using the database of short sequence fragments taken from proteins in the current release of the Swiss-Prot database. Those segments are confirmed by experiments to have single-residue post-translational modification. The sensitivities of the classification for various types of short linear motifs are in the range of 70%. The query protein sequence is dissected into short overlapping fragments. All segments are represented as vectors. Each vector is then classified by a machine learning algorithm (Support Vector Machine) as potentially modifiable or not. The resulting list of plausible post-translational sites in the query protein is returned to the user. We also present a study of the human protein kinase C family as a biological application of our method.

13.
A trans-cleaving asymmetric hammerhead ribozyme directed against an AUC↓ target motif within an RNA specific for human immunodeficiency virus type 1 (HIV-1) was generated. The AUC↓ motif of the target RNA was permutated in order to generate all 12 variants of an NUX↓ consensus target motif, wherein N = A, C, G or U and X = A, C or U. Four asymmetric hammerhead ribozymes differing in the nucleotide that is complementary to N were generated, of which each was specific for three of the 12 target motifs. The residual sequence context within helices I and III remained unchanged. All 12 combinations resulted in cleavage of the target RNA. Using single-turnover conditions, the detectable cleavage rate constants at 37 °C were determined, which varied considerably depending on the NUX↓ motif. The NUC↓ motifs were cleaved more efficiently, with AUC↓ being cleaved best. Comparison with previous studies indicates that the sequence context of the NUX↓ motif plays a major role for the detectable cleavage activity.

14.
Promoters are DNA sequences located upstream of the gene region and play a central role in gene expression. Computational techniques show good accuracy in gene prediction but are less successful in predicting promoters, primarily because of the high number of false positives that reflect characteristics of the promoter sequences. Many machine learning methods have been used to address this issue. Neural Networks (NN) have been successfully used in this field because of their ability to recognize imprecise and incomplete patterns characteristic of promoter sequences. In this paper, NN was used to predict and recognize promoter sequences in two data sets: (i) one based on nucleotide sequence information and (ii) another based on stability sequence information. The accuracy was approximately 80% for simulation (i) and 68% for simulation (ii). In the rules extracted, biological consensus motifs were important parts of the NN learning process in both simulations.

15.
Post-translational modifications (PTMs) play an essential role in most biological processes. PTMs on human proteins have been extensively studied. Studies on bacterial PTMs are emerging, which demonstrate that bacterial PTMs are different from human PTMs in their types, mechanisms and functions. Few PTM studies have been done on the microbiome. Here, we reviewed several studied PTMs in bacteria including phosphorylation, acetylation, succinylation, glycosylation, and proteases. We discussed the enzymes responsible for each PTM and their functions. We also summarized the current methods used to study microbiome PTMs and the observations demonstrating the roles of PTM in the microbe-microbe interactions within the microbiome and their interactions with the environment or host. Although new methods and tools for PTM studies are still needed, the existing technologies have made great progress enabling a deeper understanding of the functional regulation of the microbiome. Large-scale application of these microbiome-wide PTM studies will provide a better understanding of the microbiome and its roles in the development of human diseases.

16.
Interpreting the impact of human genome variation on phenotype is challenging. The functional effect of protein-coding variants is often predicted using sequence conservation and population frequency data; however, other factors are likely relevant. We hypothesized that variants in protein post-translational modification (PTM) sites contribute to phenotype variation and disease. We analyzed the fraction of rare variants and the non-synonymous to synonymous variant ratio (Ka/Ks) in 7,500 human genomes and found a significant negative selection signal in PTM regions independent of six factors, including conservation, codon usage, and GC-content, that is widely distributed across tissue-specific genes and function classes. PTM regions are also enriched in known disease mutations, suggesting that PTM variation is more likely deleterious. PTM constraint also affects flanking sequence around modified residues and increases around clustered sites, indicating the presence of functionally important short linear motifs. Using target site motifs of 124 kinases, we predict at least ∼180,000 motif-breaker amino acid residues that disrupt PTM sites when substituted, and highlight kinase motifs that show specific negative selection and enrichment of disease mutations. We provide this dataset with corresponding hypothesized mechanisms as a community resource. As an example of our integrative approach, we propose that PTPN11 variants in Noonan syndrome aberrantly activate the protein by disrupting an uncharacterized cluster of phosphorylation sites. Further, as PTMs are molecular switches that are modulated by drugs, we study mutated binding sites of PTM enzymes in disease genes and define a drug-disease network containing 413 novel predicted disease-gene links.

17.
A major challenge in proteomics is to fully identify and characterize the post-translational modification (PTM) patterns present at any given time in cells, tissues, and organisms. Here we present a fast and reliable method ("ModifiComb") for mapping hundreds of types of PTMs at a time, including novel and unexpected PTMs. The high mass accuracy of Fourier transform mass spectrometry provides in many cases a unique elemental composition of the PTM through the difference ΔM between the molecular masses of the modified and unmodified peptides, whereas the retention time difference ΔRT between their elution in reversed-phase liquid chromatography provides an additional dimension for PTM identification. Abundant sequence information obtained with complementary fragmentation techniques using ion-neutral collisions and electron capture often locates the modification to a single residue. The (ΔM, ΔRT) maps are representative of the proteome and its overall modification state and may be used for database-independent organism identification, comparative proteomic studies, and biomarker discovery. Examples of newly found modifications include +12.000 Da (+C atom) incorporation into proline residues of peptides from proline-rich proteins found in human saliva. This modification is hypothesized to increase the known activity of the peptide.

18.

Background  

There has been an explosion in the number of single nucleotide polymorphisms (SNPs) within public databases. In this study we focused on non-synonymous protein coding single nucleotide polymorphisms (nsSNPs), some associated with disease and others which are thought to be neutral. We describe the distribution of both types of nsSNPs using structural and sequence-based features and assess the relative value of these attributes as predictors of function using machine learning methods. We also address the common problem of balance within machine learning methods and show the effect of imbalance on nsSNP function prediction. We show that nsSNP function prediction can be significantly improved by 100% undersampling of the majority class. The learnt rules were then applied to make predictions of function on all nsSNPs within Ensembl.
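
Random undersampling of the majority class can be sketched as below. The abstract does not define "100% undersampling"; this sketch assumes one plausible reading, in which every larger class is reduced to the size of the smallest class. The function name, the seed and the example labels are hypothetical.

```python
import random


def undersample(samples, labels, seed=0):
    """Randomly reduce larger classes to the size of the smallest class.

    ASSUMPTION: full balancing is only one possible interpretation of
    the "100% undersampling" mentioned in the abstract.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    smallest = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        chosen = items if len(items) == smallest else rng.sample(items, smallest)
        balanced.extend((sample, label) for sample in chosen)

    rng.shuffle(balanced)  # avoid class-ordered training data
    return balanced


if __name__ == "__main__":
    # Made-up data: nine "neutral" and three "disease" samples.
    data = list(range(12))
    classes = ["neutral"] * 9 + ["disease"] * 3
    print(undersample(data, classes))
```

Balancing the training set this way discards some majority-class samples, so it trades data volume for a classifier that is less biased toward the larger class.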

19.
Due to Ca2+‐dependent binding and the sequence diversity of Calmodulin (CaM) binding proteins, identifying CaM interactions and binding sites in the wet‐lab is tedious and costly. Therefore, computational methods for this purpose are crucial to the design of such wet‐lab experiments. We present an algorithm suite called CaMELS (CalModulin intEraction Learning System) for predicting proteins that interact with CaM as well as their binding sites using sequence information alone. CaMELS offers state‐of‐the‐art accuracy for both CaM interaction and binding site prediction and can aid biologists in studying CaM binding proteins. For CaM interaction prediction, CaMELS uses protein sequence features coupled with a large‐margin classifier. CaMELS models the binding site prediction problem using multiple instance machine learning with a custom optimization algorithm, which allows more effective learning over imprecisely annotated CaM‐binding sites during training. CaMELS has been extensively benchmarked using a variety of data sets, mutagenic studies, proteome‐wide Gene Ontology enrichment analyses and protein structures. Our experiments indicate that CaMELS outperforms simple motif‐based search and other existing methods for interaction and binding site prediction. We have also found that the whole sequence of a protein, rather than just its binding site, is important for predicting its interaction with CaM. Using the machine learning model in CaMELS, we have identified important features of protein sequences for CaM interaction prediction as well as characteristic amino acid sub‐sequences and their relative position for identifying CaM binding sites. Python code for training and evaluating CaMELS together with a webserver implementation is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#camels .

20.
Protein nitration and nitrosylation are essential post-translational modifications (PTMs) involved in many fundamental cellular processes. Recent studies have revealed that excessive levels of nitration and nitrosylation in some critical proteins are linked to numerous chronic diseases. Therefore, the identification of substrates that undergo such modifications in a site-specific manner is an important research topic in the community and will provide candidates for targeted therapy. In this study, we aimed to develop a computational tool for predicting nitration and nitrosylation sites in proteins. We first constructed four types of encoding features, including positional amino acid distributions, sequence contextual dependencies, physicochemical properties, and position-specific scoring features, to represent the modified residues. Based on these encoding features, we established a predictor called DeepNitro using deep learning methods for predicting protein nitration and nitrosylation. Using n-fold cross-validation, our evaluation shows great AUC values for DeepNitro, 0.65 for tyrosine nitration, 0.80 for tryptophan nitration, and 0.70 for cysteine nitrosylation, respectively, demonstrating the robustness and reliability of our tool. Also, when tested in the independent dataset, DeepNitro is substantially superior to other similar tools with a 7% to 42% improvement in the prediction performance. Taken together, the application of deep learning method and novel encoding schemes, especially the position-specific scoring feature, greatly improves the accuracy of nitration and nitrosylation site prediction and may facilitate the prediction of other PTM sites. DeepNitro is implemented in JAVA and PHP and is freely available for academic research at http://deepnitro.renlab.org.
