Similar literature
1.
A combined transmembrane topology and signal peptide prediction method   Total citations: 31, self-citations: 0, citations by others: 31
An inherent problem in transmembrane protein topology prediction and signal peptide prediction is the high similarity between the hydrophobic regions of a transmembrane helix and those of a signal peptide, leading to cross-reaction between the two types of predictions. To improve predictions further, it is therefore important to make a predictor that aims to discriminate between the two classes. In addition, topology information can be gained when successfully predicting a signal peptide leading a transmembrane protein since it dictates that the N terminus of the mature protein must be on the non-cytoplasmic side of the membrane. Here, we present Phobius, a combined transmembrane protein topology and signal peptide predictor. The predictor is based on a hidden Markov model (HMM) that models the different sequence regions of a signal peptide and the different regions of a transmembrane protein in a series of interconnected states. Training was done on a newly assembled and curated dataset. Compared to TMHMM and SignalP, errors coming from cross-prediction between transmembrane segments and signal peptides were reduced substantially by Phobius. False classifications of signal peptides were reduced from 26.1% to 3.9% and false classifications of transmembrane helices were reduced from 19.0% to 7.7%. Phobius was applied to the proteomes of Homo sapiens and Escherichia coli. Here we also noted a drastic reduction of false classifications compared to TMHMM/SignalP, suggesting that Phobius is well suited for whole-genome annotation of signal peptides and transmembrane regions. The method is available at as well as at
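The idea of decoding sequence regions as a series of interconnected HMM states can be illustrated with a toy example. The sketch below is not Phobius or TMHMM: it assumes a hypothetical two-state model (loop `L`, membrane `M`), a made-up set of hydrophobic residues, and invented probabilities, and it uses standard Viterbi decoding to label each residue with its most likely state.

```python
import math

# All values below are illustrative assumptions, not parameters of any published model.
HYDROPHOBIC = set("AILMFVW")                 # hypothetical hydrophobic residue set
STATES = ("L", "M")                          # L = loop, M = membrane
START = {"L": 0.5, "M": 0.5}
TRANS = {"L": {"L": 0.9, "M": 0.1},
         "M": {"L": 0.1, "M": 0.9}}
P_HYDROPHOBIC = {"L": 0.3, "M": 0.8}         # chance of emitting a hydrophobic residue


def emission(state, residue):
    """Probability that `state` emits `residue` (hydrophobic vs. not)."""
    p = P_HYDROPHOBIC[state]
    return p if residue in HYDROPHOBIC else 1.0 - p


def viterbi(sequence):
    """Return the most probable state path for `sequence` as a string of L/M."""
    if not sequence:
        return ""
    # Log-probabilities avoid numerical underflow on long sequences.
    score = {s: math.log(START[s]) + math.log(emission(s, sequence[0]))
             for s in STATES}
    back = []
    for residue in sequence[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(emission(s, residue)))
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


labels = viterbi("KKKK" + "I" * 8 + "KKKK")
print(labels)
```

Real predictors use many more states (for example separate signal peptide regions, helix caps and loops on each side of the membrane) and parameters trained on curated data; the two-state layout here only shows the decoding step.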

2.
MOTIVATION: Many important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion are mediated by membrane proteins. Unfortunately, as these proteins are not water soluble, it is extremely hard to experimentally determine their structure. Therefore, improved methods for predicting the structure of these proteins are vital in biological research. In order to improve transmembrane topology prediction, we evaluate the combined use of both integrated signal peptide prediction and evolutionary information in a single algorithm. RESULTS: A new method (MEMSAT3) for predicting transmembrane protein topology from sequence profiles is described and benchmarked with full cross-validation on a standard data set of 184 transmembrane proteins. The method is found to predict both the correct topology and the locations of transmembrane segments for 80% of the test set. This compares with accuracies of 62-72% for other popular methods on the same benchmark. By using a second neural network specifically to discriminate transmembrane from globular proteins, a very low overall false positive rate (0.5%) can also be achieved in detecting transmembrane proteins. AVAILABILITY: An implementation of the described method is available both as a web server (http://www.psipred.net) and as downloadable source code from http://bioinf.cs.ucl.ac.uk/memsat. Both the server and source code files are free to non-commercial users. Benchmark and training data are also available from http://bioinf.cs.ucl.ac.uk/memsat.

3.
4.
We describe and validate a new membrane protein topology prediction method, TMHMM, based on a hidden Markov model. We present a detailed analysis of TMHMM's performance, and show that it correctly predicts 97-98% of the transmembrane helices. Additionally, TMHMM can discriminate between soluble and membrane proteins with both specificity and sensitivity better than 99%, although the accuracy drops when signal peptides are present. This high degree of accuracy allowed us to predict reliably integral membrane proteins in a large collection of genomes. Based on these predictions, we estimate that 20-30% of all genes in most genomes encode membrane proteins, which is in agreement with previous estimates. We further discovered that proteins with N(in)-C(in) topologies are strongly preferred in all examined organisms, except Caenorhabditis elegans, where the large number of 7TM receptors increases the counts for N(out)-C(in) topologies. We discuss the possible relevance of this finding for our understanding of membrane protein assembly mechanisms. A TMHMM prediction service is available at http://www.cbs.dtu.dk/services/TMHMM/.

5.
Protein structure prediction is a cornerstone of bioinformatics research. Membrane proteins require their own prediction methods due to their intrinsically different composition. A variety of tools exist for topology prediction of membrane proteins, many of them available on the Internet. The server described in this paper, BPROMPT (Bayesian PRediction Of Membrane Protein Topology), uses a Bayesian Belief Network to combine the results of other prediction methods, providing a more accurate consensus prediction. Topology predictions with accuracies of 70% for prokaryotes and 53% for eukaryotes were achieved. BPROMPT can be accessed at http://www.jenner.ac.uk/BPROMPT.

6.
Predicting transmembrane beta-barrels in proteomes   Total citations: 1, self-citations: 0, citations by others: 1
Very few methods address the problem of predicting beta-barrel membrane proteins directly from sequence. One reason is that only very few high-resolution structures for transmembrane beta-barrel (TMB) proteins have been determined thus far. Here we introduce the design, statistics and results of a novel profile-based hidden Markov model for the prediction and discrimination of TMBs. The method carefully attempts to avoid over-fitting the sparse experimental data. While our model training and scoring procedures were very similar to a recently published work, the architecture and structure-based labelling were significantly different. In particular, we introduced a new definition of beta-hairpin motifs, explicit state modelling of transmembrane strands, and a log-odds whole-protein discrimination score. The resulting method reached an overall four-state (up-, down-strand, periplasmic-, outer-loop) accuracy as high as 86%. Furthermore, the method accurately discriminated TMB from non-TMB proteins (45% coverage at 100% accuracy). This high precision enabled the application to 72 entirely sequenced Gram-negative bacteria. We found over 164 previously uncharacterized TMB proteins at high confidence. Database searches did not implicate any of these proteins with membranes. We contend that the vast majority of our 164 predictions will eventually be verified experimentally. All proteome predictions and the PROFtmb prediction method are available at http://www.rostlab.org/services/PROFtmb/.

7.
The HMMTOP transmembrane topology prediction server   Total citations: 22, self-citations: 0, citations by others: 22
The HMMTOP transmembrane topology prediction server predicts both the localization of helical transmembrane segments and the topology of transmembrane proteins. Recently, several improvements have been introduced to the original method. The user can now submit additional information about segment localization to enhance the prediction power. This option improves the prediction accuracy and helps with the interpretation of experimental results, e.g. in epitope insertion experiments. Availability: HMMTOP 2.0 is freely available to non-commercial users at http://www.enzim.hu/hmmtop. Source code is also available upon request to academic users.

8.
During the last two decades a large number of computational methods have been developed for predicting transmembrane protein topology. Current predictors rely on topogenic signals in the protein sequence, such as the distribution of positively charged residues in extra-membrane loops and the existence of N-terminal signals. However, phosphorylation and glycosylation are post-translational modifications (PTMs) that occur in a compartment-specific manner and therefore the presence of a phosphorylation or glycosylation site in a transmembrane protein provides topological information. We examine the combination of phosphorylation and glycosylation site prediction with transmembrane protein topology prediction. We report the development of a Hidden Markov Model based method, capable of predicting the topology of transmembrane proteins and the existence of kinase specific phosphorylation and N/O-linked glycosylation sites along the protein sequence. Our method integrates a novel feature in transmembrane protein topology prediction, which results in improved performance for topology prediction and reliable prediction of phosphorylation and glycosylation sites. The method is freely available at http://bioinformatics.biol.uoa.gr/HMMpTM.

9.
BACKGROUND: Mixture model on graphs (MMG) is a probabilistic model that integrates network topology with (gene, protein) expression data to predict the regulation state of genes and proteins. It is remarkably robust to missing data, a feature particularly important for its use in quantitative proteomics. A new implementation in C and interfaced with R makes MMG extremely fast and easy to use and to extend. AVAILABILITY: The original implementation (Matlab) is still available from http://www.dcs.shef.ac.uk/~guido/; the new implementation is available from http://wrightlab.group.shef.ac.uk/people_noirel.htm, from CRAN, and has been submitted to BioConductor, http://www.bioconductor.org/.

10.
Introduction: Integral membrane proteins and lipids constitute the bilayer membranes that surround cells and sub-cellular compartments, and modulate movements of molecules and information between them. Since membrane protein drug targets represent a disproportionately large segment of the proteome, technical developments need timely review.

Areas covered: Publicly available resources such as PubMed were surveyed. Bottom-up proteomics analyses now allow efficient extraction and digestion such that membrane protein coverage is essentially complete, making up around one third of the proteome. However, this coverage relies upon hydrophilic loop regions while transmembrane domains are generally poorly covered in peptide-based strategies. Top-down mass spectrometry where the intact membrane protein is fragmented in the gas phase gives good coverage in transmembrane regions, and membrane fractions are yielding to high-throughput top-down proteomics. Exciting progress in native mass spectrometry of membrane protein complexes is providing insights into subunit stoichiometry and lipid binding, and cross-linking strategies are contributing critical in-vivo information.

Expert commentary: It is clear from the literature that integral membrane proteins have yielded to advanced techniques in protein chemistry and mass spectrometry, with applications limited only by the imagination of investigators. Key advances toward translation to the clinic are emphasized.


11.
Membrane proteins perform a variety of functions, all crucially dependent on their orientation in the membrane. However, neither the exact number of transmembrane domains (TMDs) nor the topology of most proteins has been experimentally determined. Due to this, most scientists rely primarily on prediction algorithms to determine topology and TMD assignments. Since these can give contradictory results, single-algorithm-based predictions are unreliable. To map the extent of potential misanalysis, the predictions of nine algorithms on the yeast proteome are compared and it is found that they have little agreement when predicting TMD number and termini orientation. To view all predictions in parallel, a webpage called TopologYeast: http://www.weizmann.ac.il/molgen/TopologYeast was created. Each algorithm is compared with experimental data and a poor agreement is found. The analysis suggests that more systematic data on protein topology are required to increase the training sets for prediction algorithms and to have accurate knowledge of membrane protein topology.

12.
The complex hydrophobic and hydrophilic milieus of membrane-associated proteins pose experimental and theoretical challenges to their understanding. Here, we produce a nonredundant database to compute knowledge-based asymmetric cross-membrane potentials from the per-residue distributions of C(β), C(γ) and functional group atoms. We predict transmembrane and peripherally associated regions from genomic sequence and position peptides and protein structures relative to the bilayer (available at http://www.degradolab.org/ez). The pseudo-energy topological landscapes underscore positional stability and functional mechanisms demonstrated here for antimicrobial peptides, transmembrane proteins, and viral fusion proteins. Moreover, experimental effects of point mutations on the relative ratio changes of dual-topology proteins are quantitatively reproduced. The functional group potential and the membrane-exposed residues display the largest energetic changes, enabling native-like structures to be distinguished from decoys. Hence, focusing on the uniqueness of membrane-associated proteins and peptides, we quantitatively parameterize their cross-membrane propensity, thus facilitating structural refinement, characterization, prediction, and design.

13.
Introduction: Many lines of evidence indicate that low levels of HDL cholesterol increase the risk of cardiovascular disease (CVD). However, recent clinical studies of statin-treated subjects with established atherosclerosis cast doubt on the hypothesis that elevating HDL cholesterol levels reduces CVD risk.

Areas covered: It is critical to identify new HDL metrics that capture HDL’s proposed cardioprotective effects. One promising approach is quantitative MS/MS-based HDL proteomics. This article focuses on recent studies of the feasibility and challenges of using this strategy in translational studies. It also discusses how lipid-lowering therapy and renal disease alter HDL’s functions and proteome, and how HDL might serve as a platform for binding proteins with specific functional properties.

Expert commentary: It is clear that HDL has a diverse protein cargo and that its functions extend well beyond its classic role in lipid transport and reverse cholesterol transport. MS/MS analysis has demonstrated that HDL might contain >80 different proteins. Key challenges are demonstrating that these proteins truly associate with HDL, are functionally important, and that MS-based HDL proteomics can reproducibly detect biomarkers in translational studies of disease risk.


14.
MOTIVATION: Knowledge of the transmembrane helical topology can help identify binding sites and infer functions for membrane proteins. However, because membrane proteins are hard to solubilize and purify, only a very small number of membrane proteins have had their structure and topology experimentally determined. This has motivated various computational methods for predicting the topology of membrane proteins. RESULTS: We present an improved hidden Markov model, TMMOD, for the identification and topology prediction of transmembrane proteins. Our model uses TMHMM as a prototype, but differs from TMHMM in the architecture of the submodels for loops on both sides of the membrane and also in the model training procedure. In cross-validation experiments using a set of 83 transmembrane proteins with known topology, TMMOD outperformed TMHMM and other existing methods, with an accuracy of 89% for both topology and locations. In another experiment using a separate set of 160 transmembrane proteins, TMMOD had 84% for topology and 89% for locations. When utilized for distinguishing transmembrane proteins from non-transmembrane proteins, particularly signal peptides, TMMOD has consistently fewer false positives than TMHMM does. Application of TMMOD to a collection of complete genomes shows that the number of predicted membrane proteins accounts for approximately 20-30% of all genes in those genomes, and that the topology where both the N- and C-termini are in the cytoplasm is dominant in these organisms except for Caenorhabditis elegans. AVAILABILITY: http://liao.cis.udel.edu/website/servers/TMMOD/

15.
MOTIVATION: The experimental difficulties of alpha-helical transmembrane protein structure determination make this class of protein an important target for sequence-based structure prediction tools. The MEMPACK prediction server allows users to submit a transmembrane protein sequence and returns transmembrane topology, lipid exposure, residue contacts, helix-helix interactions and helical packing arrangement predictions in both plain text and graphical formats using a number of novel machine learning-based algorithms. AVAILABILITY: The server can be accessed as a new component of the PSIPRED portal at http://bioinf.cs.ucl.ac.uk/psipred/.

16.
Objective: Lymph node metastasis leads to high mortality rates in oral squamous cell carcinoma (OSCC). However, how to define clinically negative neck (cN0) and positive neck (cN1-3) remains controversial.

Methods: We retrieved candidate biomarkers identified by proteomic analysis in OSCC from the published literature. In the training stage, immunohistochemistry (IHC) analysis was used to determine the expression of proteins, and logistic regression models with stepwise variable selection were used to identify potential factors that might affect lymph node metastasis and life status. The prediction model was then validated in the validation stage.

Results: We screened eight highly expressed proteins related to lymph node metastasis in OSCC and found that the expression levels of SOD2, BST2, CAD, ITGB6, and PRDX4 were significantly elevated in patients with lymph node metastasis compared to patients without lymph node metastasis. Furthermore, in the training and validation stages, a prediction model based on the combination of CAD and SOD2 expression levels and histopathologic grade was developed and validated in patients with OSCC.

Conclusions: Our findings showed that the developed model predicts lymph node metastasis and life status well in patients with OSCC, independent of TNM stage.


17.
18.
Introduction: The technological and scientific progress made in the Human Proteome Project (HPP) has provided the scientific community with a new set of experimental and bioinformatic methods in the challenging field of shotgun and SRM/MRM-based proteomics. The requirements for a protein to be considered experimentally validated are now well established, and the information about the human proteome is available in the neXtProt database, while targeted proteomic assays are stored in SRMAtlas. However, the study of the missing proteins remains an outstanding issue.

Areas covered: This review focuses on the implementation of proteogenomic methods designed to improve the detection and validation of the missing proteins. The evolution of the methodological strategies based on the combination of different omic technologies and the use of huge publicly available datasets is shown, taking the Chromosome 16 Consortium as a reference.

Expert commentary: Proteogenomics and other data analysis strategies implemented within the C-HPP initiative could be used as guidance to complete the catalog of human proteins in the near future. In addition, in the coming years we will probably witness their use in the B/D-HPP initiative to go a step further in understanding the implications of these proteins for human biology and disease.


19.

Background

Computational identification of apicoplast-targeted proteins is important in drug target determination for diseases such as malaria. While there are established methods for identifying proteins with a bipartite signal in multiple species of Apicomplexa, not all apicoplast-targeted proteins possess this bipartite signature. The publication of recent experimental findings of apicoplast membrane proteins, called transmembrane proteins, that do not possess a bipartite signal has made it feasible to devise a machine learning approach for identifying this new class of apicoplast-targeted proteins computationally.

Methodology/principal findings

In this work, we develop a method for predicting apicoplast-targeted transmembrane proteins for multiple species of Apicomplexa, whereby several classifiers trained on different feature sets and based on different algorithms are evaluated and combined in an ensemble classification model to obtain the best expected performance. The feature sets considered are the hydrophobicity and composition characteristics of amino acids over transmembrane domains, the existence of short sequence motifs over cytosolically disposed regions, and Gene Ontology (GO) terms associated with given proteins. Our model, ApicoAMP, is an ensemble classification model that combines decisions of classifiers following the majority vote principle. ApicoAMP is trained on a set of proteins from 11 apicomplexan species and achieves 91% overall expected accuracy.

Conclusions/significance

ApicoAMP is the first computational model capable of identifying apicoplast-targeted transmembrane proteins in Apicomplexa. The ApicoAMP prediction software is available at http://code.google.com/p/apicoamp/ and http://bcb.eecs.wsu.edu.
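The majority vote principle mentioned in this abstract can be sketched in a few lines. This is only a generic illustration, not the ApicoAMP code: the function name, the label strings, and the tie-breaking rule (the earliest label among those tied wins) are assumptions made for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most classifiers.

    `predictions` holds one label per classifier. Ties are broken in favour
    of the tied label that appears first (an arbitrary, assumed rule).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


# Hypothetical decisions from three classifiers trained on different feature sets.
votes = ["apicoplast", "other", "apicoplast"]
print(majority_vote(votes))
```

Using an odd number of binary classifiers avoids ties altogether, which is one reason ensembles of this kind often combine three or five base models.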

20.
Introduction: Cancer is one of the leading causes of morbidity and mortality worldwide. A hallmark of cancer is evasion of apoptosis leading to tumor progression and drug resistance. Biomarker research has become a sign of the times, and proteins involved in apoptosis may be used for clinical diagnostic or prognostic purposes in cancer treatment. Recent progress in proteomic technology has prompted a growing number of researchers to study the molecular mechanisms that regulate the apoptotic signal transduction pathways in cancer.

Areas covered: A PubMed search for ‘Proteomics’ and ‘cancer’ and ‘chemotherapy’ and ‘apoptosis’ was conducted for literature published up to December 2017.

Results: The study of apoptotic protein signatures in cancer provides valuable information for more effective prognosis, response to therapy and the identification of novel drug targets. A huge number of bioinformatic tools are available to interpret raw data. For quantification, mass spectrometry is the most reliable technique.

Expert commentary: This field of research is, however, still in its infancy and more intensive research is warranted to explore the full potential of biomarkers for clinical use. Progress in this field is influenced by the detection limit of current quantification methods as well as patient and cancer inter-individual profiles.

