Similar articles
Found 20 similar articles (search time: 656 ms)
1.
We describe and validate a new membrane protein topology prediction method, TMHMM, based on a hidden Markov model. We present a detailed analysis of TMHMM's performance, and show that it correctly predicts 97-98% of the transmembrane helices. Additionally, TMHMM can discriminate between soluble and membrane proteins with both specificity and sensitivity better than 99%, although the accuracy drops when signal peptides are present. This high degree of accuracy allowed us to reliably predict integral membrane proteins in a large collection of genomes. Based on these predictions, we estimate that 20-30% of all genes in most genomes encode membrane proteins, which is in agreement with previous estimates. We further discovered that proteins with N(in)-C(in) topologies are strongly preferred in all examined organisms, except Caenorhabditis elegans, where the large number of 7TM receptors increases the counts for N(out)-C(in) topologies. We discuss the possible relevance of this finding for our understanding of membrane protein assembly mechanisms. A TMHMM prediction service is available at http://www.cbs.dtu.dk/services/TMHMM/.

2.
MOTIVATION: Knowledge of the transmembrane helical topology can help identify binding sites and infer functions for membrane proteins. However, because membrane proteins are hard to solubilize and purify, only a very small number of membrane proteins have experimentally determined structure and topology. This has motivated various computational methods for predicting the topology of membrane proteins. RESULTS: We present an improved hidden Markov model, TMMOD, for the identification and topology prediction of transmembrane proteins. Our model uses TMHMM as a prototype, but differs from TMHMM in the architecture of the submodels for loops on both sides of the membrane and in the model training procedure. In cross-validation experiments using a set of 83 transmembrane proteins with known topology, TMMOD outperformed TMHMM and other existing methods, with an accuracy of 89% for both topology and locations. In another experiment using a separate set of 160 transmembrane proteins, TMMOD achieved an accuracy of 84% for topology and 89% for locations. When used to distinguish transmembrane proteins from non-transmembrane proteins, particularly those with signal peptides, TMMOD has consistently fewer false positives than TMHMM. Application of TMMOD to a collection of complete genomes shows that the number of predicted membrane proteins accounts for approximately 20-30% of all genes in those genomes, and that the topology where both the N- and C-termini are in the cytoplasm is dominant in these organisms except for Caenorhabditis elegans. AVAILABILITY: http://liao.cis.udel.edu/website/servers/TMMOD/

3.
Transmembrane proteins affect vital cellular functions and pathogenesis, and are a focus of drug design. It is difficult to obtain diffraction-quality crystals to study transmembrane protein structure. Computational tools for transmembrane protein topology prediction fill the gap between the abundance of transmembrane proteins and the scarcity of known membrane protein structures. Their prediction accuracy is still inadequate: TMHMM, the current state-of-the-art method, has less than 52% accuracy in topology prediction on one set of transmembrane proteins of known topology. Based on the observation that there are functional domains that occur preferentially internal or external to the membrane, we have extended the model of TMHMM to incorporate functional domains, using a probabilistic approach originally developed for computational gene finding. Our extension is better than TMHMM in predicting the topology of transmembrane proteins. As the prediction of functional domains improves, our system's prediction accuracy will likely improve as well.

4.
MOTIVATION: Many important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion are mediated by membrane proteins. Unfortunately, as these proteins are not water soluble, it is extremely hard to experimentally determine their structure. Therefore, improved methods for predicting the structure of these proteins are vital in biological research. In order to improve transmembrane topology prediction, we evaluate the combined use of both integrated signal peptide prediction and evolutionary information in a single algorithm. RESULTS: A new method (MEMSAT3) for predicting transmembrane protein topology from sequence profiles is described and benchmarked with full cross-validation on a standard data set of 184 transmembrane proteins. The method is found to predict both the correct topology and the locations of transmembrane segments for 80% of the test set. This compares with accuracies of 62-72% for other popular methods on the same benchmark. By using a second neural network specifically to discriminate transmembrane from globular proteins, a very low overall false positive rate (0.5%) can also be achieved in detecting transmembrane proteins. AVAILABILITY: An implementation of the described method is available both as a web server (http://www.psipred.net) and as downloadable source code from http://bioinf.cs.ucl.ac.uk/memsat. Both the server and source code files are free to non-commercial users. Benchmark and training data are also available from http://bioinf.cs.ucl.ac.uk/memsat.

5.
Wavelet change-point prediction of transmembrane proteins   Total citations: 3, self-citations: 0, citations by others: 3
MOTIVATION: A non-parametric method, based on a wavelet data-dependent threshold technique for change-point analysis, is applied to predict the location and topology of helices in transmembrane proteins. A new propensity scale generated from a transmembrane helix database is proposed. RESULTS: We show that wavelet change-point analysis performs well for smoothing hydropathy and transmembrane profiles generated using different scales. We investigate which wavelet bases and threshold functions are overall most appropriate to detect transmembrane segments. Prediction accuracy is based on the analysis of two data sets used as standard benchmarks for transmembrane prediction algorithms. The analysis of a test set of 83 proteins results in a per-segment accuracy of 98.2%; the analysis of a 48-protein blind-test set, i.e. one containing proteins not used to generate the propensity scales, results in a per-segment accuracy of 97.4%. We believe that this method can also be applied to the detection of boundaries of other patterns such as G + C isochores and dot-plots. AVAILABILITY: The transmembrane database, TMALN and source code are available upon request from the authors.

6.
Genomics and proteomics have added valuable information to our knowledge base of the human biological system, including the discovery of therapeutic targets and disease biomarkers. However, molecular profiling studies commonly result in the identification of novel proteins of unknown localization. A class of proteins of special interest is membrane proteins, in particular plasma membrane proteins. Despite their biological and medical significance, the 3-dimensional structures of less than 1% of plasma membrane proteins have been determined. In order to aid in the identification of membrane proteins, a number of computational methods have been developed. These tools operate by predicting the presence of transmembrane segments. Here, we utilized five topology prediction methods (TMHMM, SOSUI, waveTM, HMMTOP, and TopPred II) in order to estimate the ratio of integral membrane proteins in the human proteome. These methods employ different algorithms and include a newly developed method (waveTM) that has yet to be tested on a large proteome database. Since these tools are prone to error, mainly as a result of falsely predicting signal peptides as transmembrane segments, we have utilized an additional method, SignalP. Based on our analyses, the ratio of human proteins with transmembrane segments is estimated to fall between 15% and 39% with a consensus of 13%. Agreement among the programs is reduced further when both a positive identification of a membrane protein and the number of transmembrane segments per protein are considered. Such a broad range of prediction depends on the selectivity of the individual method in predicting integral membrane proteins. These methods can play a critical role in determining protein structure and, hence, identifying suitable drug targets in humans.

7.
Helices in membrane-spanning regions are more tightly packed than the helices in soluble proteins. Thus, we introduce a method that uses a simple scale of burial propensity and a new algorithm to predict transmembrane helical (TMH) segments and a positive-inside rule to predict amino-terminal orientation. The method (the topology predictor of transmembrane helical proteins using mean burial propensity [THUMBUP]) correctly predicted the topology of 55 of 73 proteins (or 75%) with known three-dimensional structures (the 3D helix database). This level of accuracy can be reached by MEMSAT 1.8 (a 200-parameter model-recognition method) and a new HMM-based method (a 111-parameter hidden Markov model, UMDHMM(TMHP)) if they are retrained with the 73-protein database. Thus, a method based on a physicochemical property can provide topology prediction as accurate as methods based on more complicated statistical models and learning algorithms for proteins with accurately known structures. Commonly used HMM-based methods and MEMSAT 1.8 were trained with a combination of the partial 3D helix database and a 1D helix database of TMH proteins in which topology information was obtained by gene fusion and other experimental techniques. These methods provide a significantly poorer prediction for the topology of TMH proteins in the 3D helix database. This suggests that the 1D helix database, because of its inaccuracy, should be avoided as either a training or testing database. A Web server of THUMBUP and UMDHMM(TMHP) is established for academic users at http://www.smbs.buffalo.edu/phys_bio/service.htm. The 3D helix database is also available from the same Web site.
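
The positive-inside rule mentioned in this abstract can be illustrated with a minimal sketch. This is not THUMBUP's implementation: the function name `predict_n_terminus`, the 0-based half-open helix coordinates, the use of only Lys and Arg as positive residues, and the tie-breaking choice are all assumptions made for illustration.

```python
def predict_n_terminus(sequence, helices):
    """Guess whether the N-terminus is 'in' (cytoplasmic) or 'out'.

    sequence: amino-acid string.
    helices:  sorted list of (start, end) TM helix spans, 0-based, half-open.
              (Hypothetical input format; helix prediction itself is not shown.)
    """
    # Cut the sequence into the loops that flank the predicted helices.
    loops = []
    previous_end = 0
    for start, end in helices:
        loops.append(sequence[previous_end:start])
        previous_end = end
    loops.append(sequence[previous_end:])

    def positives(segment):
        # Count Lys (K) and Arg (R); ignoring His is an assumption.
        return segment.count("K") + segment.count("R")

    # Loops alternate sides of the membrane: loops 0, 2, 4, ... share the
    # side of the N-terminus, loops 1, 3, 5, ... lie on the opposite side.
    same_side = sum(positives(loop) for loop in loops[0::2])
    other_side = sum(positives(loop) for loop in loops[1::2])

    # Positive-inside rule: the side richer in K/R is taken as cytoplasmic.
    # Ties default to 'in' here, which is an arbitrary choice.
    return "in" if same_side >= other_side else "out"


# One helix with a positively charged N-terminal loop.
print(predict_n_terminus("MKRK" + "L" * 20 + "DE", [(4, 24)]))
```

In practice, loops are often truncated to a fixed number of residues next to each helix before counting; that refinement is left out to keep the sketch short.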

8.
OrienTM is a software program that uses an initial definition of transmembrane segments to predict the topology of transmembrane proteins from their sequence. It uses position-specific statistical information for amino acid residues that belong to putative non-transmembrane segments, derived from statistical analysis of non-transmembrane regions of membrane proteins stored in the SwissProt database. Its accuracy compares well with that of other popular existing methods. A web-based version of OrienTM is publicly available at http://biophysics.biol.uoa.gr/OrienTM.

9.
Secreted protein prediction system combining CJ-SPHMM, TMHMM, and PSORT   Total citations: 4, self-citations: 0, citations by others: 4
To increase the coverage of secreted protein prediction, we describe a combination strategy. Instead of using a single method, we combine the Hidden Markov Model (HMM)-based methods CJ-SPHMM and TMHMM with PSORT in secreted protein prediction. CJ-SPHMM is an HMM-based signal peptide prediction method, while TMHMM is an HMM-based transmembrane (TM) protein prediction algorithm. With CJ-SPHMM and TMHMM, proteins with a predicted signal peptide and without predicted TM regions are taken as putative secreted proteins. This HMM-based approach predicts secreted proteins with Ac (Accuracy) at 0.82 and Cc (Correlation coefficient) at 0.75, which are similar to PSORT with Ac at 0.82 and Cc at 0.76. When we further complement the HMM-based method, i.e., CJ-SPHMM + TMHMM, with PSORT in secreted protein prediction, the Ac value is increased to 0.86 and the Cc value is increased to 0.81. Using this combination strategy to search for putative secreted proteins in the International Protein Index (IPI) maintained at the European Bioinformatics Institute (EBI), we constructed a putative human secretome with 5235 proteins. The prediction system described here can also be applied to predicting secreted proteins from other vertebrate proteomes. Availability: The CJ-SPHMM and predicted secreted proteins are available at: ftp://ftp.cbi.pku.edu.cn/pub/secreted-protein/
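
The HMM-based filtering rule stated in this abstract (a predicted signal peptide and no predicted TM regions) can be sketched as below. The `Prediction` record and its field names are hypothetical stand-ins for parsed predictor output; they are not the output formats of CJ-SPHMM or TMHMM, and the way PSORT is combined with this rule is not specified in the abstract, so it is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Per-protein summary of two predictors (hypothetical record layout)."""
    protein_id: str
    has_signal_peptide: bool  # verdict of a signal peptide predictor
    n_tm_regions: int         # number of TM regions from a TM predictor


def is_putative_secreted(prediction):
    # Rule from the abstract: signal peptide predicted AND no TM regions.
    return prediction.has_signal_peptide and prediction.n_tm_regions == 0


# Made-up example records, for illustration only.
predictions = [
    Prediction("P1", True, 0),   # signal peptide, no TM region
    Prediction("P2", True, 2),   # signal peptide but membrane-anchored
    Prediction("P3", False, 0),  # no signal peptide
]
secretome = [p.protein_id for p in predictions if is_putative_secreted(p)]
print(secretome)
```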

10.
We present a novel method that predicts transmembrane domains in proteins using solely information contained in the sequence itself. The PRED-TMR algorithm described refines a standard hydrophobicity analysis with a detection of potential termini ('edges', starts and ends) of transmembrane regions. This allows one both to discard highly hydrophobic regions not delimited by clear start and end configurations and to confirm putative transmembrane segments not distinguishable by their hydrophobic composition. The accuracy obtained on a test set of 101 non-homologous transmembrane proteins with reliable topologies compares well with that of other popular existing methods. Only a slight decrease in prediction accuracy was observed when the algorithm was applied to all transmembrane proteins of the SwissProt database (release 35). A WWW server running the PRED-TMR algorithm is available at http://o2.db.uoa.gr/PRED-TMR/

11.
The TOPDOM database is a collection of domains and sequence motifs located consistently on the same side of the membrane in alpha-helical transmembrane proteins. The database was created by scanning well-annotated transmembrane protein sequences in the UniProt database by specific domain or motif detecting algorithms. The identified domains or motifs were added to the database if they were uniformly annotated on the same side of the membrane of the various proteins in the UniProt database. The information about the location of the collected domains and motifs can be incorporated into constrained topology prediction algorithms, like HMMTOP, increasing the prediction accuracy. AVAILABILITY: The TOPDOM database and the constrained HMMTOP prediction server are available on the page http://topdom.enzim.hu CONTACT: tusi@enzim.hu; lkalmar@enzim.hu.

12.
The HMMTOP transmembrane topology prediction server   Total citations: 22, self-citations: 0, citations by others: 22
The HMMTOP transmembrane topology prediction server predicts both the localization of helical transmembrane segments and the topology of transmembrane proteins. Recently, several improvements have been introduced to the original method. The user can now submit additional information about segment localization to enhance the prediction power. This option improves the prediction accuracy and helps with the interpretation of experimental results, e.g. in epitope insertion experiments. Availability: HMMTOP 2.0 is freely available to non-commercial users at http://www.enzim.hu/hmmtop. Source code is also available upon request to academic users.

13.
Hidden Markov models (HMMs) have been successfully applied to the tasks of transmembrane protein topology prediction and signal peptide prediction. In this paper we expand upon this work by making use of the more powerful class of dynamic Bayesian networks (DBNs). Our model, Philius, is inspired by a previously published HMM, Phobius, and combines a signal peptide submodel with a transmembrane submodel. We introduce a two-stage DBN decoder that combines the power of posterior decoding with the grammar constraints of Viterbi-style decoding. Philius also provides protein type, segment, and topology confidence metrics to aid in the interpretation of the predictions. We report a relative improvement of 13% over Phobius in full-topology prediction accuracy on transmembrane proteins, and a sensitivity and specificity of 0.96 in detecting signal peptides. We also show that our confidence metrics correlate well with the observed precision. In addition, we have made predictions on all 6.3 million proteins in the Yeast Resource Center (YRC) database. This large-scale study provides an overall picture of the relative numbers of proteins that include a signal-peptide and/or one or more transmembrane segments as well as a valuable resource for the scientific community. All DBNs are implemented using the Graphical Models Toolkit. Source code for the models described here is available at http://noble.gs.washington.edu/proj/philius. A Philius Web server is available at http://www.yeastrc.org/philius, and the predictions on the YRC database are available at http://www.yeastrc.org/pdr.

14.
Alpha-helical transmembrane proteins (αTMPs) represent roughly 30% of all open reading frames (ORFs) in a typical genome and are involved in many critical biological processes. Because of their special physicochemical properties, they are hard to crystallize and high-resolution structures are hard to obtain experimentally. Sequence-based topology prediction is therefore highly desirable for the study of transmembrane proteins (TMPs), both in structure prediction and in function prediction. Various model-based topology prediction methods have been developed, but the accuracy of the individual predictors remains poor owing to the limitations of the methods or of the features they use. Consensus topology prediction, which combines the strengths of the individual predictors, thus becomes practical for high-accuracy applications. Here, based on the observation that inter-helical interactions are commonly found within transmembrane helices (TMHs) and strongly indicate their existence, we present a novel consensus topology prediction method for αTMPs, CNTOP, which incorporates four leading individual topology predictors and further improves the prediction accuracy by using predicted inter-helical interactions. The method achieved 87% prediction accuracy on a benchmark dataset and 78% accuracy on a non-redundant dataset composed of polytopic αTMPs. Our method achieves higher topology accuracy than any of the other individual and consensus predictors; at the same time, the lengths and locations of the TMHs are predicted more accurately, with both false positives (FPs) and false negatives (FNs) decreasing dramatically. CNTOP is available at: http://ccst.jlu.edu.cn/JCSB/cntop/CNTOP.html.
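
The general idea of consensus prediction can be illustrated with a plain per-residue majority vote. This is a deliberately simpler technique than CNTOP, whose combination scheme and use of inter-helical interactions are not described in the abstract. The label alphabet (`i` inside, `M` membrane, `o` outside), the function name, and the tie-breaking rule are assumptions.

```python
from collections import Counter


def consensus_topology(predictions):
    """Per-residue majority vote over equal-length topology strings."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    consensus = []
    for column in zip(*predictions):
        counts = Counter(column)
        best = max(counts.values())
        # On a tie, keep the label from the earliest-listed predictor
        # (an arbitrary, order-dependent choice).
        label = next(c for c in column if counts[c] == best)
        consensus.append(label)
    return "".join(consensus)


# Four made-up per-residue predictions for a 7-residue toy sequence.
votes = ["iiMMMoo", "iiMMMMo", "iMMMMoo", "iiiMMoo"]
print(consensus_topology(votes))
```

A column-wise vote can produce label sequences that violate topology grammar (for example, a one-residue helix), so a real consensus method would need an additional consistency step.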

15.
The polytopic membrane protein presenilin 1 (PS1) is a component of the γ-secretase complex that is responsible for the intramembranous cleavage of several type I transmembrane proteins, including the β-amyloid precursor protein (APP). Mutations of PS1, apparently leading to aberrant processing of APP, have been genetically linked to early-onset familial Alzheimer’s disease. PS1 contains 10 hydrophobic regions (HRs) sufficiently long to be α-helical membrane-spanning segments. Most topology models for PS1 place its COOH-terminal 40 amino acids, which include HR 10, in the cytosolic space. However, several recent observations suggest that HR 10 may be integrated into the membrane and involved in the interaction between PS1 and APP. We have applied three independent methodologies to investigate the location of HR 10 and the extreme COOH terminus of PS1. The results from these methods indicate that HR 10 spans the membrane and that the COOH-terminal amino acids of PS1 lie in the extracytoplasmic space. Keywords: Alzheimer’s disease; γ-secretase; β-amyloid; intramembranous protease; transmembrane topology

16.
Transmembrane beta barrel (TMB) proteins are found in the outer membranes of bacteria, mitochondria and chloroplasts. TMBs are involved in a variety of functions such as mediating flux of metabolites and active transport of siderophores, enzymes and structural proteins, and in the translocation across or insertion into membranes. We present here TMBHMM, a computational method based on a hidden Markov model for predicting the structural topology of putative TMBs from sequence. In addition to predicting transmembrane strands, TMBHMM also predicts the exposure status (i.e., exposed to the membrane or hidden in the protein structure) of the residues in the transmembrane region, which is a novel feature of the TMBHMM method. Furthermore, TMBHMM can also predict the membrane residues that are not part of beta barrel forming strands. The training of the TMBHMM was performed on a non-redundant data set of 19 TMBs. The self-consistency test yielded Q(2) accuracy of 0.87, Q(3) accuracy of 0.83, Matthews correlation coefficient of 0.74 and SOV for beta strand of 0.95. In this self-consistency test the method predicted 83% of transmembrane residues with correct exposure status. On an unseen, non-redundant test data set of 10 proteins, the 2-state and 3-state TMBHMM prediction accuracies are around 73% and 72%, respectively, and are comparable to other methods from the literature. The TMBHMM web server takes an amino acid sequence or a multiple sequence alignment as an input and predicts the exposure status and the structural topology as output. The TMBHMM web server is available under the tmbhmm tab at: http://service.bioinformatik.uni-saarland.de/tmx-site/.
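
The two-state accuracy and Matthews correlation coefficient quoted above can be computed from per-residue labels as in the following sketch. It assumes Q(2) means the fraction of residues assigned to the correct one of two states, which is the usual reading but is not spelled out in the abstract; the function name and the label characters are illustrative.

```python
import math


def q2_and_mcc(observed, predicted, positive="M"):
    """Return (Q2, MCC) for two equal-length per-residue label strings.

    `positive` marks the transmembrane state; every other character is
    treated as the non-membrane state.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("label strings must be non-empty and equal in length")

    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if obs == positive and pred == positive:
            tp += 1
        elif obs == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1

    q2 = (tp + tn) / len(observed)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column of the confusion matrix is
    # empty; returning 0.0 in that case is a common convention.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return q2, mcc


q2, mcc = q2_and_mcc("MMMMoooo", "MMMooooo")
print(f"Q2 = {q2:.3f}, MCC = {mcc:.3f}")
```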

17.
Prediction of transmembrane (TM) segments of amino acid sequences of membrane proteins is a well-known and very important problem. The accuracy of its solution can be improved for approaches that do not use a homology search in an additional data bank. There is a lack of tested data in this area of research, because information on the structure of membrane proteins is scarce. In this work we created a test sample of structural alignments for membrane proteins. The TM segments of these proteins were mapped according to aligned 3D structures resolved for these proteins. A method for predicting TM segments in an alignment was developed on the basis of the forward-backward algorithm from the HMM theory. This method allows a user not only to predict TM segments, but also to create a probabilistic membrane profile, which can be employed in multiple alignment procedures taking the secondary structure of proteins into account. The method was implemented in a computer program available at http://bioinf.fbb.msu.ru/fwdbck/. It provides better results than the MEMSAT method, which is nearly the only tool predicting TM segments in multiple alignments, without a homology search.
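
The forward-backward algorithm named in this abstract yields, for each position, the posterior probability of each hidden state, which is the kind of "probabilistic membrane profile" the authors describe. The sketch below runs it on a toy two-state HMM over a single sequence reduced to hydrophobic (`H`) and polar (`P`) symbols. The states, the probabilities, and the single-sequence setting are made-up assumptions; the published method works on alignments and its model is not given here.

```python
def forward_backward(obs, states, start, trans, emit):
    """Posterior state probabilities per position for a discrete HMM."""
    if not obs:
        return []
    n = len(obs)

    # Forward pass, rescaled at every position to avoid underflow.
    fwd, scale = [], []
    for t, symbol in enumerate(obs):
        row = {}
        for s in states:
            if t == 0:
                prior = start[s]
            else:
                prior = sum(fwd[t - 1][r] * trans[r][s] for r in states)
            row[s] = prior * emit[s][symbol]
        total = sum(row.values())
        scale.append(total)
        fwd.append({s: value / total for s, value in row.items()})

    # Backward pass, reusing the forward scaling factors.
    bwd = [None] * n
    bwd[n - 1] = {s: 1.0 for s in states}
    for t in range(n - 2, -1, -1):
        bwd[t] = {
            s: sum(trans[s][r] * emit[r][obs[t + 1]] * bwd[t + 1][r]
                   for r in states) / scale[t + 1]
            for s in states
        }

    # Combine both passes and normalise to get per-position posteriors.
    posterior = []
    for t in range(n):
        row = {s: fwd[t][s] * bwd[t][s] for s in states}
        total = sum(row.values())
        posterior.append({s: value / total for s, value in row.items()})
    return posterior


# Toy parameters (illustrative values only).
STATES = ("TM", "loop")
START = {"TM": 0.5, "loop": 0.5}
TRANS = {"TM": {"TM": 0.9, "loop": 0.1}, "loop": {"TM": 0.1, "loop": 0.9}}
EMIT = {"TM": {"H": 0.8, "P": 0.2}, "loop": {"H": 0.3, "P": 0.7}}

profile = forward_backward("PPPHHHHHHPPP", STATES, START, TRANS, EMIT)
print(" ".join(f"{p['TM']:.2f}" for p in profile))
```

The printed row is the per-position posterior probability of the `TM` state, which rises over the hydrophobic stretch.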

18.
MOTIVATION: The dearth of structural data on alpha-helical membrane proteins (MPs) has thus far hampered the development of reliable knowledge-based potentials that can be used for automatic prediction of transmembrane (TM) protein structure. While algorithms for identifying TM segments are available, modeling of the TM domains of alpha-helical MPs involves assembling the segments into a bundle. This requires the correct assignment of the buried and lipid-exposed faces of the TM domains. RESULTS: A recent increase in the number of crystal structures of alpha-helical MPs has enabled an analysis of the lipid-exposed surfaces and the interiors of such molecules on the basis of structure, rather than sequence alone. Together with a conservation criterion that is based on previous observations that conserved residues are mostly found in the interior of MPs, the bias of certain residue types to be preferably buried or exposed is proposed as a criterion for predicting the lipid-exposed and interior faces of TMs. Application to known structures demonstrates 80% accuracy of this prediction algorithm. AVAILABILITY: The algorithm used for the predictions is implemented in the ProperTM Web server (http://icb.med.cornell.edu/services/propertm/start).

19.
It has been shown that progress in the determination of membrane protein structures grows exponentially, with approximately the same growth rate as that of water-soluble proteins. In order to investigate the effect of this on the performance of prediction algorithms for both α-helical and β-barrel membrane proteins, we conducted a prospective study based on historical records. We trained separate hidden Markov models with different sized training sets and evaluated their performance on topology pred...

20.
Shen H, Chou JJ. PLoS ONE, 2008, 3(6): e2399
Prediction of transmembrane helices (TMH) in alpha-helical membrane proteins provides valuable information about the protein topology when high-resolution structures are not available. Many predictors have been developed based on either amino acid hydrophobicity scales or purely statistical approaches. While these predictors perform reasonably well in identifying the number of TMHs in a protein, they are generally inaccurate in predicting the ends of TMHs, or TMHs of unusual length. To improve the accuracy of TMH detection, we developed a machine-learning based predictor, MemBrain, which integrates a number of modern bioinformatics approaches including sequence representation by multiple sequence alignment matrix, the optimized evidence-theoretic K-nearest neighbor prediction algorithm, fusion of multiple prediction window sizes, and classification by dynamic threshold. MemBrain demonstrates an overall improvement of about 20% in prediction accuracy, particularly in predicting the ends of TMHs and TMHs that are shorter than 15 residues. It also has the capability to detect N-terminal signal peptides. The MemBrain predictor is a useful sequence-based analysis tool for functional and structural characterization of helical membrane proteins; it is freely available at http://chou.med.harvard.edu/bioinf/MemBrain/.

