Found 20 similar articles (search time: 15 ms)
1.
Alpha-helical transmembrane proteins (αTMPs) account for roughly 30% of all open reading frames (ORFs) in a typical genome and are involved in many critical biological processes. Because of their special physicochemical properties, they are hard to crystallize and high-resolution structures are difficult to obtain experimentally; sequence-based topology prediction is therefore highly desirable for the study of transmembrane proteins (TMPs), in both structure prediction and function prediction. Various model-based topology prediction methods have been developed, but the accuracy of the individual predictors remains poor because of the limitations of the methods or the features they use. Consensus topology prediction, which combines the strengths of the individual predictors, thus becomes practical for high-accuracy applications. Here, based on the observation that inter-helical interactions are commonly found within transmembrane helices (TMHs) and strongly indicate their existence, we present a novel consensus topology prediction method for αTMPs, CNTOP, which incorporates four leading individual topology predictors and further improves prediction accuracy by using predicted inter-helical interactions. The method achieved 87% prediction accuracy on a benchmark dataset and 78% accuracy on a non-redundant dataset composed of polytopic αTMPs. Our method achieves higher topology accuracy than any of the individual predictors or consensus predictors; at the same time, the lengths and locations of the TMHs are predicted more accurately, with both false positives (FPs) and false negatives (FNs) decreasing dramatically. CNTOP is available at: http://ccst.jlu.edu.cn/JCSB/cntop/CNTOP.html.
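The consensus idea in entry 1 can be illustrated with a minimal sketch. It is a plain per-residue majority vote, not CNTOP's actual algorithm (the abstract does not describe how the four predictors and the inter-helical interactions are combined). The label alphabet (`i` inside, `M` membrane, `o` outside), the function names, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def consensus_topology(predictions):
    """Per-residue majority vote over equal-length topology strings.

    Labels are assumed to be 'i' (inside loop), 'M' (membrane helix)
    and 'o' (outside loop). Ties go to the earliest predictor in the
    list, which is an arbitrary choice for this sketch.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same sequence")
    consensus = []
    for column in zip(*predictions):
        counts = Counter(column)
        best = max(counts.values())
        # First label, in predictor order, that reaches the top count.
        consensus.append(next(c for c in column if counts[c] == best))
    return "".join(consensus)


def helix_segments(topology):
    """Return 0-based inclusive (start, end) pairs for each run of 'M'."""
    segments, start = [], None
    for i, label in enumerate(topology):
        if label == "M" and start is None:
            start = i
        elif label != "M" and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(topology) - 1))
    return segments


if __name__ == "__main__":
    votes = ["iiMMMMoo", "iiiMMMoo", "iMMMMMoo", "iiMMMooo"]
    merged = consensus_topology(votes)
    print(merged, helix_segments(merged))
```

A per-residue vote can produce implausibly short helices or loops where predictors disagree, which is one reason real consensus methods add further evidence or post-processing.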
2.
A combined transmembrane topology and signal peptide prediction method (total citations: 31; self-citations: 0; citations by others: 31)
An inherent problem in transmembrane protein topology prediction and signal peptide prediction is the high similarity between the hydrophobic regions of a transmembrane helix and that of a signal peptide, leading to cross-reaction between the two types of predictions. To improve predictions further, it is therefore important to make a predictor that aims to discriminate between the two classes. In addition, topology information can be gained when successfully predicting a signal peptide leading a transmembrane protein since it dictates that the N terminus of the mature protein must be on the non-cytoplasmic side of the membrane. Here, we present Phobius, a combined transmembrane protein topology and signal peptide predictor. The predictor is based on a hidden Markov model (HMM) that models the different sequence regions of a signal peptide and the different regions of a transmembrane protein in a series of interconnected states. Training was done on a newly assembled and curated dataset. Compared to TMHMM and SignalP, errors coming from cross-prediction between transmembrane segments and signal peptides were reduced substantially by Phobius. False classifications of signal peptides were reduced from 26.1% to 3.9% and false classifications of transmembrane helices were reduced from 19.0% to 7.7%. Phobius was applied to the proteomes of Homo sapiens and Escherichia coli. Here we also noted a drastic reduction of false classifications compared to TMHMM/SignalP, suggesting that Phobius is well suited for whole-genome annotation of signal peptides and transmembrane regions. The method is available at as well as at
3.
The CLN3 gene encodes an integral membrane protein of unknown function. Mutations in CLN3 can cause juvenile neuronal ceroid lipofuscinosis, or Batten disease, an inherited neurodegenerative lysosomal storage disease affecting children. Here, we report a topological study of the CLN3 protein using bioinformatic approaches constrained by experimental data. Our results suggest that CLN3 has a six transmembrane helix topology with cytoplasmic N and C-termini, three large lumenal loops, one of which may contain an amphipathic helix, and one large cytoplasmic loop. Surprisingly, varied topological predictions were made using different subsets of orthologous sequences, highlighting the challenges still remaining for bioinformatics.
4.
Georgios N. Tsaousis, Pantelis G. Bagos, Stavros J. Hamodrakas. Biochimica et Biophysica Acta - Proteins and Proteomics, 2014, 1844(2): 316-322
During the last two decades a large number of computational methods have been developed for predicting transmembrane protein topology. Current predictors rely on topogenic signals in the protein sequence, such as the distribution of positively charged residues in extra-membrane loops and the existence of N-terminal signals. However, phosphorylation and glycosylation are post-translational modifications (PTMs) that occur in a compartment-specific manner, and therefore the presence of a phosphorylation or glycosylation site in a transmembrane protein provides topological information. We examine the combination of phosphorylation and glycosylation site prediction with transmembrane protein topology prediction. We report the development of a hidden Markov model-based method capable of predicting the topology of transmembrane proteins and the existence of kinase-specific phosphorylation and N/O-linked glycosylation sites along the protein sequence. Our method integrates a novel feature in transmembrane protein topology prediction, which results in improved performance for topology prediction and reliable prediction of phosphorylation and glycosylation sites. The method is freely available at http://bioinformatics.biol.uoa.gr/HMMpTM.
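One of the topogenic signals mentioned in entry 4, the distribution of positively charged residues in extra-membrane loops, is commonly summarized as the "positive-inside rule": loops on the cytoplasmic side tend to be richer in lysine and arginine. The sketch below applies that rule in its simplest form. It is not the HMM-based method of the abstract; the function names are hypothetical, and counting K/R over entire loops (rather than over a window near each helix) is a simplifying assumption.

```python
def loops_from_helices(sequence, helices):
    """Split a sequence into loop regions around 0-based inclusive helix spans."""
    loops, cursor = [], 0
    for start, end in helices:
        loops.append(sequence[cursor:start])
        cursor = end + 1
    loops.append(sequence[cursor:])
    return loops


def n_terminus_side(sequence, helices, positive="KR"):
    """Guess the side of the N-terminus with the positive-inside rule.

    Loops alternate sides of the membrane, so loops 0, 2, 4, ... lie on
    the same side as the N-terminus. The side carrying more K/R residues
    is called cytoplasmic ("in").
    """
    loops = loops_from_helices(sequence, helices)
    charge = [sum(res in positive for res in loop) for loop in loops]
    n_side = sum(charge[0::2])
    other_side = sum(charge[1::2])
    if n_side == other_side:
        return "undetermined"
    return "in" if n_side > other_side else "out"


if __name__ == "__main__":
    # Toy sequence: charged N-terminal loop, two helices, charged C-terminal loop.
    seq = "MKRK" + "L" * 20 + "DDSE" + "A" * 20 + "RRK"
    print(n_terminus_side(seq, [(4, 23), (28, 47)]))
```

The helix spans are taken as given here; in practice they would come from a separate helix-detection step.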
5.
The reported performance of existing transmembrane (TM) topology prediction methods was often based on evaluations that neglected the risk of signal peptides (SP) being predicted as putative TM segments as well. Here, we evaluated 12 selected TM topology prediction methods (TMpred, TopPred II, DAS, TMAP, MEMSAT 2, SOSUI, PRED-TMR2, TMHMM 2.0, HMMTOP 2.0, SPLIT 3.5, TM Finder, and MPEx) for the effect of SP on prediction performance, considering three SP treatments, namely: "remain" (untreated), "removed first", and "removed later". The results showed that the presence of SP significantly affected the prediction performance of the 12 selected TM topology prediction methods for all three predicted attributes (the number of transmembrane segments (TMSs), the number of TMSs plus position, and the N-tail location) and for the predicted topology (combined predictions of the three attributes) by causing a reduction in prediction accuracy. In particular, lower prediction accuracies were obtained if SP was left untreated (remain), while significant increases were observed if SP was removed either first or later. However, the difference between the "removed first" and "removed later" SP treatments was statistically insignificant. In addition, we found that machine learning-based prediction methods were less affected by the presence of SP than hydropathy-based methods, although the risk of degraded prediction performance remains, to a lesser degree. Thus, when performing genome-wide analysis, the SP issue should be addressed during TM topology prediction.
6.
The HMMTOP transmembrane topology prediction server (total citations: 22; self-citations: 0; citations by others: 22)
The HMMTOP transmembrane topology prediction server predicts both the localization of helical transmembrane segments and the topology of transmembrane proteins. Recently, several improvements have been introduced to the original method. The user can now submit additional information about segment localization to enhance the prediction power. This option improves the prediction accuracy and helps the interpretation of experimental results, i.e. in epitope insertion experiments. Availability: HMMTOP 2.0 is freely available to non-commercial users at http://www.enzim.hu/hmmtop. Source code is also available upon request to academic users.
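Entry 6 mentions that user-supplied localization information can be fed into the prediction. One common way to do this in an HMM is to restrict Viterbi decoding so that constrained positions may only be explained by states with the required label. The sketch below shows that idea on a toy four-state model. It is not HMMTOP's architecture or parameter set: the states, the three residue classes, and every probability are invented for illustration.

```python
import math

# Toy model: 'i' inside loop, 'H' helix running in->out,
# 'o' outside loop, 'h' helix running out->in. All numbers are made up.
STATES = ["i", "H", "o", "h"]
LABEL = {"i": "i", "H": "M", "o": "o", "h": "M"}
TRANS = {
    "i": {"i": 0.9, "H": 0.1},
    "H": {"H": 0.95, "o": 0.05},
    "o": {"o": 0.9, "h": 0.1},
    "h": {"h": 0.95, "i": 0.05},
}
START = {"i": 0.5, "o": 0.5}
EMIT = {
    "i": {"hydrophobic": 0.2, "positive": 0.3, "other": 0.5},
    "o": {"hydrophobic": 0.2, "positive": 0.1, "other": 0.7},
    "H": {"hydrophobic": 0.8, "positive": 0.05, "other": 0.15},
    "h": {"hydrophobic": 0.8, "positive": 0.05, "other": 0.15},
}


def residue_class(res):
    if res in "AILMFVWC":
        return "hydrophobic"
    return "positive" if res in "KR" else "other"


def log(p):
    return math.log(p) if p > 0 else -math.inf


def constrained_viterbi(sequence, constraints=None):
    """Most probable label string; constraints maps position -> 'i', 'M' or 'o'."""
    if not sequence:
        return ""
    constraints = constraints or {}

    def emit(pos, state):
        # A state whose label contradicts a constraint gets zero probability.
        if pos in constraints and LABEL[state] != constraints[pos]:
            return -math.inf
        return log(EMIT[state][residue_class(sequence[pos])])

    score = {s: log(START.get(s, 0)) + emit(0, s) for s in STATES}
    back = []
    for pos in range(1, len(sequence)):
        row, ptr = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + log(TRANS[p].get(s, 0)))
            row[s] = score[prev] + log(TRANS[prev].get(s, 0)) + emit(pos, s)
            ptr[s] = prev
        score = row
        back.append(ptr)
    state = max(STATES, key=lambda s: score[s])
    if score[state] == -math.inf:
        raise ValueError("constraints are incompatible with the model")
    path = [state]
    for ptr in reversed(back):  # follow back-pointers from the last residue
        state = ptr[state]
        path.append(state)
    return "".join(LABEL[s] for s in reversed(path))


if __name__ == "__main__":
    seq = "GSGS" + "L" * 20 + "GSGS"
    print(constrained_viterbi(seq, {0: "i"}))
    print(constrained_viterbi(seq, {0: "o"}))
```

In this toy example the sequence alone cannot decide the orientation, so a single constraint on the first residue fixes the whole topology, which mirrors the point made in the abstract about limited experimental information.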
7.
Transmembrane proteins affect vital cellular functions and pathogenesis, and are a focus of drug design. It is difficult to obtain diffraction-quality crystals to study transmembrane protein structure. Computational tools for transmembrane protein topology prediction fill the gap between the abundance of transmembrane proteins and the scarcity of known membrane protein structures. Their prediction accuracy is still inadequate: TMHMM, the current state-of-the-art method, has less than 52% accuracy in topology prediction on one set of transmembrane proteins of known topology. Based on the observation that there are functional domains that occur preferentially internal or external to the membrane, we have extended the model of TMHMM to incorporate functional domains, using a probabilistic approach originally developed for computational gene finding. Our extension is better than TMHMM in predicting the topology of transmembrane proteins. As prediction of functional domains improves, our system's prediction accuracy will likely improve as well.
8.
Jones DT. Bioinformatics (Oxford, England), 2007, 23(5): 538-544
MOTIVATION: Many important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion are mediated by membrane proteins. Unfortunately, as these proteins are not water soluble, it is extremely hard to experimentally determine their structure. Therefore, improved methods for predicting the structure of these proteins are vital in biological research. In order to improve transmembrane topology prediction, we evaluate the combined use of both integrated signal peptide prediction and evolutionary information in a single algorithm. RESULTS: A new method (MEMSAT3) for predicting transmembrane protein topology from sequence profiles is described and benchmarked with full cross-validation on a standard data set of 184 transmembrane proteins. The method is found to predict both the correct topology and the locations of transmembrane segments for 80% of the test set. This compares with accuracies of 62-72% for other popular methods on the same benchmark. By using a second neural network specifically to discriminate transmembrane from globular proteins, a very low overall false positive rate (0.5%) can also be achieved in detecting transmembrane proteins. AVAILABILITY: An implementation of the described method is available both as a web server (http://www.psipred.net) and as downloadable source code from http://bioinf.cs.ucl.ac.uk/memsat. Both the server and source code files are free to non-commercial users. Benchmark and training data are also available from http://bioinf.cs.ucl.ac.uk/memsat.
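Accuracy figures such as those in entry 8 count a protein as correct only if both the topology and the helix locations match the reference. A rough sketch of such a per-protein check follows. The exact criteria of the MEMSAT3 benchmark are not given in the abstract, so the rule used here (same helix count, same side for the first loop residue, and at least `min_overlap` shared residues per helix, with a default of 5) is an assumption, as are the function names.

```python
import re


def helix_spans(topology):
    """0-based inclusive (start, end) spans of each run of 'M'."""
    return [(m.start(), m.end() - 1) for m in re.finditer(r"M+", topology)]


def first_loop_side(topology):
    """Label of the first non-membrane residue ('i' or 'o'), or None."""
    return next((c for c in topology if c != "M"), None)


def topology_correct(predicted, observed, min_overlap=5):
    """True if helix count, orientation and helix positions all agree."""
    pred, obs = helix_spans(predicted), helix_spans(observed)
    if len(pred) != len(obs):
        return False
    if first_loop_side(predicted) != first_loop_side(observed):
        return False
    for (p_start, p_end), (o_start, o_end) in zip(pred, obs):
        overlap = min(p_end, o_end) - max(p_start, o_start) + 1
        if overlap < min_overlap:
            return False
    return True


def accuracy(pairs, min_overlap=5):
    """Fraction of (predicted, observed) pairs judged correct."""
    results = [topology_correct(p, o, min_overlap) for p, o in pairs]
    return sum(results) / len(results)


if __name__ == "__main__":
    observed = "iiiii" + "M" * 20 + "ooooo"
    shifted = "iiiiiii" + "M" * 18 + "ooooo"
    flipped = "ooooo" + "M" * 20 + "iiiii"
    print(accuracy([(shifted, observed), (flipped, observed)]))
```

Because published benchmarks differ in overlap thresholds and in how signal peptides are handled, accuracies from different abstracts in this list are not directly comparable.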
9.
State-of-the-art methods for topology prediction of α-helical membrane proteins are based on the use of time-consuming multiple sequence alignments obtained from PSI-BLAST or other sources. Here, we examine whether the consensus of topology prediction methods that are based on single sequences can reach an accuracy similar to that of the more accurate multiple sequence-based methods. We show that TOPCONS-single performs better than any of the other topology prediction methods tested here, but ~6% worse than the best method that uses multiple sequence alignments. AVAILABILITY AND IMPLEMENTATION: TOPCONS-single is available as a web server from http://single.topcons.net/ and is also included for local installation from the web site. In addition, consensus-based topology predictions for the entire international protein index (IPI) are available from the web server and will be updated at regular intervals.
10.
An improved hidden Markov model for transmembrane protein detection and topology prediction and its applications to complete genomes (total citations: 4; self-citations: 0; citations by others: 4)
MOTIVATION: Knowledge of the transmembrane helical topology can help identify binding sites and infer functions for membrane proteins. However, because membrane proteins are hard to solubilize and purify, only a very small number of membrane proteins have experimentally determined structure and topology. This has motivated various computational methods for predicting the topology of membrane proteins. RESULTS: We present an improved hidden Markov model, TMMOD, for the identification and topology prediction of transmembrane proteins. Our model uses TMHMM as a prototype, but differs from TMHMM in the architecture of the submodels for loops on both sides of the membrane and also in the model training procedure. In cross-validation experiments using a set of 83 transmembrane proteins with known topology, TMMOD outperformed TMHMM and other existing methods, with an accuracy of 89% for both topology and locations. In another experiment using a separate set of 160 transmembrane proteins, TMMOD had 84% for topology and 89% for locations. When used to distinguish transmembrane proteins from non-transmembrane proteins, particularly signal peptides, TMMOD has consistently fewer false positives than TMHMM does. Application of TMMOD to a collection of complete genomes shows that the number of predicted membrane proteins accounts for approximately 20-30% of all genes in those genomes, and that the topology where both the N- and C-termini are in the cytoplasm is dominant in these organisms except for Caenorhabditis elegans. AVAILABILITY: http://liao.cis.udel.edu/website/servers/TMMOD/
11.
12.
The prediction of transmembrane (TM) helix and topology provides important information about the structure and function of a membrane protein. Due to the experimental difficulties in obtaining a high-resolution model, computational methods are highly desirable. In this paper, we present a hierarchical classification method using support vector machines (SVMs) that integrates selected features by capturing the sequence-to-structure relationship and developing a new scoring function based on membrane protein folding. The proposed approach is evaluated on low- and high-resolution data sets with cross-validation, and the topology (sidedness) prediction accuracy reaches as high as 90%. Our method is also found to correctly predict both the location of TM helices and the topology for 69% of the low-resolution benchmark set. We also test our method for discrimination between soluble and membrane proteins and achieve very low overall false positive (0.5%) and false negative rates (0 to approximately 1.2%). Lastly, the analysis of the scoring function suggests that the topogeneses of single-spanning and multispanning TM proteins have different levels of complexity, and the consideration of interloop topogenic interactions for the latter is the key to achieving better predictions. This method can facilitate the annotation of membrane proteomes to extract useful structural and functional information. It is publicly available at http://bio-cluster.iis.sinica.edu.tw/~bioapp/SVMtop.
13.
We have developed reliability scores for five widely used membrane protein topology prediction methods, and have applied them both on a test set of 92 bacterial plasma membrane proteins with experimentally determined topologies and on all predicted helix bundle membrane proteins in three fully sequenced genomes: Escherichia coli, Saccharomyces cerevisiae and Caenorhabditis elegans. We show that the reliability scores work well for the TMHMM and MEMSAT methods, and that they allow the probability that the predicted topology is correct to be estimated for any protein. We further show that the available test set is biased towards high-scoring proteins when compared to the genome-wide data sets, and provide estimates for the expected prediction accuracy of TMHMM across the three genomes. Finally, we show that the performance of TMHMM is considerably better when limited experimental information (such as the in/out location of a protein's C terminus) is available, and estimate that at least ten percentage points in overall accuracy in whole-genome predictions can be gained in this way.
14.
We propose a novel method for identifying and classifying the functions of transmembrane (TM) proteins based on their TM topology [the number of TM segments (tms), the loop length and the N-terminus location]. In this method, the TM topology is expressed as a string of '0' and '1', and this is designated the binary topology pattern (BTP). We focused on TM proteins with up to 12 tms, with the exception of 1 and 9 tms, and classified them into 37 functional groups by the number of tms and the functional annotation. These grouped TM protein sequences were used to determine BTPs which are specific to the individual functional groups. Since the evaluated accuracies (sensitivity, specificity and self-consistency) of these patterns in functional identification were quite high overall, i.e. 0.940, 0.934 and 0.935, respectively, as averaged over the 37 functional groups, we confirmed that TM protein function can be identified by the number of tms and the characteristics of loop lengths, i.e. BTPs.
15.
Protein structure prediction is a cornerstone of bioinformatics research. Membrane proteins require their own prediction methods due to their intrinsically different composition. A variety of tools exist for topology prediction of membrane proteins, many of them available on the Internet. The server described in this paper, BPROMPT (Bayesian PRediction Of Membrane Protein Topology), uses a Bayesian Belief Network to combine the results of other prediction methods, providing a more accurate consensus prediction. Topology predictions with accuracies of 70% for prokaryotes and 53% for eukaryotes were achieved. BPROMPT can be accessed at http://www.jenner.ac.uk/BPROMPT.
16.
OrienTM is a software tool that uses an initial definition of transmembrane segments to predict the topology of transmembrane proteins from their sequence. It uses position-specific statistical information for amino acid residues that belong to putative non-transmembrane segments, derived from statistical analysis of non-transmembrane regions of membrane proteins stored in the SwissProt database. Its accuracy compares well with that of other popular existing methods. A web-based version of OrienTM is publicly available at http://biophysics.biol.uoa.gr/OrienTM.
17.
The availability of complete bacterial genome sequences allows proteome-wide predictions of exported proteins that are potentially retained in the cytoplasmic membranes of the corresponding organisms. In practice, however, major problems are encountered with the computer-assisted distinction between (Sec-type) signal peptides that direct exported proteins into the growth medium and lipoprotein signal peptides or amino-terminal membrane anchors that cause protein retention in the membrane. In the present studies, which were aimed at improving methods to predict protein retention in the bacterial cytoplasmic membrane, we have compared sets of membrane-attached and extracellular proteins of Bacillus subtilis that were recently identified through proteomics approaches. The results showed that three classes of membrane-attached proteins can be distinguished. Two classes include 43 lipoproteins and 48 proteins with an amino-terminal transmembrane segment, respectively. Remarkably, a third class includes 31 proteins that remain membrane-retained despite the presence of typical Sec-type signal peptides with consensus signal peptidase recognition sites. This unprecedented finding indicates that unknown mechanisms are involved in membrane retention of this class of proteins. A further novelty is a consensus sequence indicative of release of certain lipoproteins from the membrane by proteolytic shaving. Finally, using non-overlapping sets of secreted and membrane-retained proteins, the accuracy of different signal peptide prediction algorithms was assessed. Accuracy for the prediction of protein retention in the membrane was increased to 82% using a majority-vote approach. Our findings provide important leads for future identification of surface proteins from pathogenic bacteria, which are attractive candidate infection markers and potential targets for drugs or vaccines.
18.
HTP: a neural network-based method for predicting the topology of helical transmembrane domains in proteins (total citations: 2; self-citations: 0; citations by others: 2)
In this paper we describe a microcomputer program (HTP) for predicting the location and orientation of α-helical transmembrane segments in integral membrane proteins. HTP is a neural network-based tool which gives as output the protein membrane topology based on the statistical propensity of residues to be located in external and internal loops. This method, which uses single protein sequences as input to the network system, correctly predicts the topology of 71 out of 92 membrane proteins of putative membrane orientation, independently of the protein source.
19.
MOTIVATION: The experimental difficulties of alpha-helical transmembrane protein structure determination make this class of protein an important target for sequence-based structure prediction tools. The MEMPACK prediction server allows users to submit a transmembrane protein sequence and returns transmembrane topology, lipid exposure, residue contacts, helix-helix interactions and helical packing arrangement predictions in both plain text and graphical formats using a number of novel machine learning-based algorithms. AVAILABILITY: The server can be accessed as a new component of the PSIPRED portal at http://bioinf.cs.ucl.ac.uk/psipred/.
20.
Topology predictions for integral membrane proteins can be substantially improved if parts of the protein can be constrained to a given in/out location relative to the membrane using experimental data or other information. Here, we have identified a set of 367 domains in the SMART database that, when found in soluble proteins, have compartment-specific localization of a kind relevant for membrane protein topology prediction. Using these domains as prediction constraints, we are able to provide high-quality topology models for 11% of the membrane proteins extracted from 38 eukaryotic genomes. Two-thirds of these proteins are single spanning, a group of proteins for which current topology prediction methods perform particularly poorly.