Similar articles
 20 similar articles found
1.
Transmembrane proteins (TMPs) are important drug targets because they are essential for signaling, regulation, and transport. Despite important breakthroughs, experimental structure determination remains challenging for TMPs. Various methods have bridged the gap by predicting transmembrane helices (TMHs), but room for improvement remains. Here, we present TMSEG, a novel method identifying TMPs and accurately predicting their TMHs and their topology. The method combines machine learning with empirical filters. Testing it on a non‐redundant dataset of 41 TMPs and 285 soluble proteins, and applying strict performance measures, TMSEG outperformed the state‐of‐the‐art in our hands. TMSEG correctly distinguished helical TMPs from other proteins with a sensitivity of 98 ± 2% and a false positive rate as low as 3 ± 1%. Individual TMHs were predicted with a precision of 87 ± 3% and recall of 84 ± 3%. Furthermore, in 63 ± 6% of helical TMPs the placement of all TMHs and their inside/outside topology was correctly predicted. There are two main features that distinguish TMSEG from other methods. First, the errors in finding all helical TMPs in an organism are significantly reduced. For example, in human this leads to 200 and 1600 fewer misclassifications compared to the second and third best method available, and 4400 fewer mistakes than by a simple hydrophobicity‐based method. Second, TMSEG provides an add‐on improvement for any existing method to benefit from. Proteins 2016; 84:1706–1716. © 2016 Wiley Periodicals, Inc.
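
The four performance measures quoted in this abstract (sensitivity, false positive rate, precision, recall) are all simple ratios of counts. The following is a minimal sketch of how they are usually computed; the function names and the counts in the usage note are hypothetical illustrations, not the evaluation code or the data of the TMSEG authors.

```python
def classification_rates(tp, fn, fp, tn):
    """Protein-level measures from confusion-matrix counts.

    tp: TMPs predicted as TMPs          fn: TMPs predicted as soluble
    fp: soluble proteins predicted as TMPs
    tn: soluble proteins predicted as soluble
    """
    sensitivity = tp / (tp + fn)          # fraction of true TMPs that were found
    false_positive_rate = fp / (fp + tn)  # fraction of soluble proteins mislabeled
    return sensitivity, false_positive_rate


def helix_precision_recall(n_correct, n_predicted, n_observed):
    """Helix-level measures.

    n_correct:   predicted helices that match an observed helix
    n_predicted: all predicted helices
    n_observed:  all observed helices
    """
    precision = n_correct / n_predicted
    recall = n_correct / n_observed
    return precision, recall
```

For example, with invented counts of 40 detected TMPs out of 41 and 9 mislabeled soluble proteins out of 285, `classification_rates(40, 1, 9, 276)` gives a sensitivity of 40/41 and a false positive rate of 9/285. What counts as a "matching" helix depends on the overlap criterion chosen, which this sketch leaves to the caller.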

2.
Shen H, Chou JJ. PLoS ONE 2008, 3(6): e2399
Prediction of transmembrane helices (TMH) in alpha helical membrane proteins provides valuable information about the protein topology when high resolution structures are not available. Many predictors have been developed based on either amino acid hydrophobicity scales or purely statistical approaches. While these predictors perform reasonably well in identifying the number of TMHs in a protein, they are generally inaccurate in predicting the ends of TMHs, or TMHs of unusual length. To improve the accuracy of TMH detection, we developed a machine-learning based predictor, MemBrain, which integrates a number of modern bioinformatics approaches including sequence representation by multiple sequence alignment matrix, the optimized evidence-theoretic K-nearest neighbor prediction algorithm, fusion of multiple prediction window sizes, and classification by dynamic threshold. MemBrain demonstrates an overall improvement of about 20% in prediction accuracy, particularly in predicting the ends of TMHs and TMHs that are shorter than 15 residues. It also has the capability to detect N-terminal signal peptides. The MemBrain predictor is a useful sequence-based analysis tool for functional and structural characterization of helical membrane proteins; it is freely available at http://chou.med.harvard.edu/bioinf/MemBrain/.

3.

Background  

Prediction of the transmembrane strands and topology of β-barrel outer membrane proteins is of interest in current bioinformatics research. Several methods using different algorithmic techniques have been applied to this task so far, and a number of freely available predictors exist. The methods can be broadly divided into those based on Hidden Markov Models (HMMs), on Neural Networks (NNs) and on Support Vector Machines (SVMs). In this work, we compare the different available methods for topology prediction of β-barrel outer membrane proteins. We evaluate their performance on a non-redundant dataset of 20 β-barrel outer membrane proteins of gram-negative bacteria, with structures known at atomic resolution. We also describe, for the first time, an effective way to combine the individual predictors, at will, into a single consensus prediction method.
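
The abstract does not say how the individual predictors are combined. One simple possibility is a per-residue majority vote over the label strings that each predictor emits, sketched below. The voting rule, the tie-break, and the label alphabet (`i` inside, `M` membrane strand, `o` outside) are assumptions made for illustration and may differ from the consensus method of the paper.

```python
from collections import Counter


def consensus(predictions):
    """Per-residue majority vote over equally long label strings.

    Ties are broken in favor of the predictor listed first, so the caller
    can order predictors by how much they are trusted.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same residues")

    result = []
    for column in zip(*predictions):      # labels proposed for one residue
        counts = Counter(column)
        best = max(counts.values())
        # First label (in predictor order) that reaches the top count.
        winner = next(label for label in column if counts[label] == best)
        result.append(winner)
    return "".join(result)
```

With three hypothetical predictor outputs, `consensus(["iiMMMooo", "iiiMMooo", "iMMMMooo"])` returns `"iiMMMooo"`. A plain vote can produce strands that are too short or an inconsistent inside/outside alternation, so a real consensus method would typically add a post-processing step that enforces a valid topology.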

4.

Background

Although Transmembrane Proteins (TMPs) are highly important in various biological processes and pharmaceutical developments, general prediction of TMP structures is still far from satisfactory. Because TMPs have significantly different physicochemical properties from soluble proteins, current protein structure prediction tools for soluble proteins may not work well for TMPs. With the increasing number of experimental TMP structures available, template-based methods have the potential to become broadly applicable for TMP structure prediction. However, the current fold recognition methods for TMPs are not as well developed as they are for soluble proteins.

Methodology

We developed a novel TMP Fold Recognition method, TMFR, to recognize TMP folds based on sequence-to-structure pairwise alignment. The method utilizes topology-based features in alignment together with sequence profile and solvent accessibility. It also incorporates a gap penalty that depends on predicted topology structure segments. Given the difference between α-helical transmembrane proteins (αTMPs) and β-strand transmembrane proteins (βTMPs), the parameters of the scoring functions are trained separately for these two protein categories using 58 αTMPs and 17 βTMPs in a non-redundant training dataset.

Results

We compared our method with HHalign, a leading alignment tool, on a non-redundant testing dataset including 72 αTMPs and 30 βTMPs. Our method achieved 10% and 9% better accuracies than HHalign in αTMPs and βTMPs, respectively. The raw score generated by TMFR is negatively correlated with the structure similarity between the target and the template, which indicates its effectiveness for fold recognition. The result demonstrates that TMFR provides an effective TMP-specific fold recognition and alignment method.

5.
The growing number of protein sequences from genome projects calls for theoretical methods to predict transmembrane helical segments (TMHs). Several prediction methods have been reported so far, but they have deficiencies in prediction accuracy and adaptability. In this paper, a method based on the discrete wavelet transform (DWT) is developed to predict the number and location of TMHs in membrane proteins. The protein with PDB code 1KQG is used as an example to describe the prediction process. Eighty proteins with known 3D structure (325 TMHs in total) were chosen at random from the Mptopo database as the data set, and the 80 sequences were divided into 13 groups according to their function and type. TMH prediction was carried out for each group of membrane protein sequences, with satisfactory results. To verify the feasibility of the method, the 80 membrane protein sequences were treated as a test set: 308 TMHs can be predicted, and the prediction accuracy is 96.3%. Compared with the main prediction results of seven popular prediction methods, the proposed method has higher prediction accuracy.
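
The abstract names the discrete wavelet transform but not the wavelet, the decomposition level, or the decision rule. As a rough illustration of the general idea, the sketch below converts a sequence into a Kyte-Doolittle hydropathy profile, smooths it by keeping only the Haar approximation coefficients, and reports runs above a threshold as candidate TMHs. The choice of the Haar wavelet, the hydropathy scale, and the thresholding step are all assumptions made here, not details taken from the paper.

```python
import math

# Kyte-Doolittle hydropathy scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def hydropathy(sequence):
    """Map a one-letter amino acid sequence to its hydropathy values."""
    return [KD[aa] for aa in sequence]


def haar_step(signal):
    """One level of the Haar DWT (signal length must be even)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_lowpass(signal, levels):
    """Smooth a non-empty signal by discarding all Haar detail coefficients.

    Reconstructing from the level-`levels` approximation alone replaces each
    block of 2**levels samples by the block mean.
    """
    n = len(signal)
    block = 2 ** levels
    padded = list(signal) + [signal[-1]] * (-n % block)  # pad to a full block
    approx = padded
    for _ in range(levels):
        approx, _detail = haar_step(approx)
    scale = math.sqrt(2.0) ** levels
    smooth = []
    for a in approx:
        smooth.extend([a / scale] * block)  # inverse transform, details = 0
    return smooth[:n]


def segments_above(values, threshold):
    """Return (start, end) pairs, end exclusive, of runs above the threshold."""
    runs, start = [], None
    for i, v in enumerate(values):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(values)))
    return runs
```

A typical call would be `segments_above(haar_lowpass(hydropathy(seq), 3), threshold)`, where both the level and the threshold are free parameters that would have to be tuned on proteins of known structure.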

6.
Eukaryotic transmembrane helical (TMH) proteins perform a wide diversity of critical cellular functions, but remain structurally largely uncharacterized and their high-resolution structure prediction is currently hindered by the lack of close structural homologues. To address this problem, we present a novel and generic method for accurately modeling large TMH protein structures from distant homologues exhibiting distinct loop and TMH conformations. Models of the adenosine A2AR and chemokine CXCR4 receptors were ranked first in GPCR-DOCK blind prediction contests in the receptor structure accuracy category. In a benchmark of 50 TMH protein homolog pairs of diverse topology (from 5 to 12 TMHs), size (from 183 to 420 residues) and sequence identity (from 15% to 70%), the method improves most starting templates, and achieves near-atomic accuracy prediction of membrane-embedded regions. Unlike starting templates, the models are of suitable quality for computer-based protein engineering: redesigned models and redesigned X-ray structures exhibit very similar native interactions. The method should prove useful for the atom-level modeling and design of a large fraction of structurally uncharacterized TMH proteins from a wide range of structural homologues.

7.
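Existing methods for protein subcellular localization are designed for water-soluble proteins and are not suitable for transmembrane proteins, while dedicated transmembrane topology predictors are not designed for subcellular localization. This article improves the model structure of the transmembrane topology predictor TMPHMMLoc and designs a new second-order hidden Markov model. The model parameters are estimated with the Baum-Welch algorithm generalized to second-order models, and the models built for the individual subcellular locations are integrated into a single predictor. Test results on the data set show that the method performs significantly better than the support vector machine and fuzzy k-nearest neighbor methods designed for soluble proteins, and also better than the hidden Markov model method proposed in TMPHMMLoc, making it an effective method for predicting the subcellular localization of transmembrane proteins.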

8.
Because membrane proteins play a key role in drug targeting, transmembrane protein prediction is an active and challenging area of the biological sciences. Location-based prediction of transmembrane proteins is significant for the functional annotation of protein sequences. Hidden Markov model based methods have been widely applied to transmembrane topology prediction. Here we present a revised model for transmembrane protein prediction that is easier to understand than the existing one. A MATLAB script was written and compiled to estimate the parameters of the model, and the model was applied to amino acid sequences to identify transmembrane regions and their adjacent locations. The estimated transmembrane topology model is based on the TMHMM model architecture. Only 7 super states are defined in the given dataset; these were converted to 96 states on the basis of their length in the sequence. The prediction accuracy of the model was about 74%, which is good enough for transmembrane topology prediction. We therefore conclude that the hidden Markov model, implemented on the MATLAB platform, plays a crucial role in transmembrane helix prediction and could also be useful for drug discovery strategies. AVAILABILITY: The database is available for free at bioinfonavneet@gmail.com and vinaysingh@bhu.ac.in.
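
Once the parameters of a hidden Markov model have been estimated, the most probable state path for a sequence is usually obtained with the Viterbi algorithm. The sketch below shows that decoding step in Python (the abstract's own implementation is in MATLAB and is not reproduced here). The two-state toy model in the usage note, with states `L` (loop) and `M` (membrane) and observations reduced to `h` (hydrophobic) and `p` (polar), is a hypothetical stand-in for the 96-state TMHMM-like architecture described above.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path for a sequence of observations.

    Probabilities are combined in log space to avoid numerical underflow on
    long sequences. All probabilities used must be strictly positive.
    """
    log = math.log
    # score[t][s]: best log probability of any path ending in state s at t.
    score = [{s: log(start_p[s]) + log(emit_p[s][observations[0]])
              for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(observations)):
        score.append({})
        back.append({})
        for s in states:
            prev, best = max(
                ((p, score[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            score[t][s] = best + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]
```

A usage example with invented toy parameters:

```python
states = ("L", "M")
start_p = {"L": 0.9, "M": 0.1}
trans_p = {"L": {"L": 0.9, "M": 0.1}, "M": {"L": 0.1, "M": 0.9}}
emit_p = {"L": {"h": 0.3, "p": 0.7}, "M": {"h": 0.9, "p": 0.1}}
path = "".join(viterbi("pppphhhhhhpppp", states, start_p, trans_p, emit_p))
```

Here the run of hydrophobic observations is labeled as membrane and the flanks as loop. A realistic model would use 20-letter emission distributions and chains of states that encode helix length, as TMHMM does.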

9.
Previously, we introduced a neural network system predicting locations of transmembrane helices (HTMs) based on evolutionary profiles (PHDhtm, Rost B, Casadio R, Fariselli P, Sander C, 1995, Protein Sci 4:521-533). Here, we describe an improvement and an extension of that system. The improvement is achieved by a dynamic programming-like algorithm that optimizes helices compatible with the neural network output. The extension is the prediction of topology (orientation of first loop region with respect to membrane) by applying to the refined prediction the observation that positively charged residues are more abundant in extra-cytoplasmic regions. Furthermore, we introduce a method to reduce the number of false positives, i.e., proteins falsely predicted with membrane helices. The evaluation of prediction accuracy is based on a cross-validation and a double-blind test set (in total 131 proteins). The final method appears to be more accurate than other methods published: (1) For almost 89% (+/-3%) of the test proteins, all HTMs are predicted correctly. (2) For more than 86% (+/-3%) of the proteins, topology is predicted correctly. (3) We define reliability indices that correlate with prediction accuracy: for one half of the proteins, segment accuracy rises to 98%; and for two-thirds, accuracy of topology prediction is 95%. (4) The rate of proteins for which HTMs are predicted falsely is below 2% (+/-1%). Finally, the method is applied to 1,616 sequences of Haemophilus influenzae. We predict 19% of the genome sequences to contain one or more HTMs. This appears to be lower than what we predicted previously for the yeast VIII chromosome (about 25%).

10.
One major problem with the existing algorithms for the prediction of protein structural classes is their low accuracy for proteins from the α/β and α+β classes. In this study, three novel features were rationally designed to model the differences between proteins from these two classes. In combination with other rationally designed features, an 11-dimensional vector prediction method was proposed. With this method, the overall prediction accuracy on the 25PDB dataset was 1.5% higher than that of the previous best-performing method, MODAS. Furthermore, the prediction accuracy for proteins from the α+β class on the 25PDB dataset was 5% higher than that of the previous best-performing method, SCPRED. The prediction accuracies obtained with the D675 and FC699 datasets were also improved.

11.
The prediction of transmembrane (TM) helix and topology provides important information about the structure and function of a membrane protein. Due to the experimental difficulties in obtaining a high-resolution model, computational methods are highly desirable. In this paper, we present a hierarchical classification method using support vector machines (SVMs) that integrates selected features by capturing the sequence-to-structure relationship and developing a new scoring function based on membrane protein folding. The proposed approach is evaluated on low- and high-resolution data sets with cross-validation, and the topology (sidedness) prediction accuracy reaches as high as 90%. Our method is also found to correctly predict both the location of TM helices and the topology for 69% of the low-resolution benchmark set. We also test our method for discrimination between soluble and membrane proteins and achieve very low overall false positive (0.5%) and false negative rates (0 to approximately 1.2%). Lastly, the analysis of the scoring function suggests that the topogeneses of single-spanning and multispanning TM proteins have different levels of complexity, and the consideration of interloop topogenic interactions for the latter is the key to achieving better predictions. This method can facilitate the annotation of membrane proteomes to extract useful structural and functional information. It is publicly available at http://bio-cluster.iis.sinica.edu.tw/~bioapp/SVMtop.

12.
A family of structurally related intrinsic membrane proteins (facilitative glucose transporters) catalyzes the movement of glucose across the plasma membrane of animal cells. Evidence indicates that these proteins share a common structural motif in which approximately 50% of the mass is embedded in the lipid bilayer (transmembrane domain) as 12 alpha-helices (transmembrane helices; TMHs) and accommodates a water-filled channel for substrate passage (glucose channel) whose tertiary structure is currently unknown. Using recent advances in protein structure prediction algorithms, we propose here two three-dimensional structural models for the transmembrane glucose channel of the GLUT1 glucose transporter. Our models emphasize the physical dimension and water accessibility of the channel, loop lengths between TMHs, the macrodipole orientation in the four-helix bundle motif, and helix packing energy. Our models predict that five TMHs, either TMHs 3, 4, 7, 8, 11 (Model 1) or TMHs 2, 5, 11, 8, 7 (Model 2), line the channel, and the remaining TMHs surround these channel-lining TMHs. We discuss how our models are compatible with the experimental data obtained with this protein, and how they can be used in designing new biochemical and molecular biological experiments to elucidate the structural basis of this important protein function.

13.
During the last two decades a large number of computational methods have been developed for predicting transmembrane protein topology. Current predictors rely on topogenic signals in the protein sequence, such as the distribution of positively charged residues in extra-membrane loops and the existence of N-terminal signals. However, phosphorylation and glycosylation are post-translational modifications (PTMs) that occur in a compartment-specific manner and therefore the presence of a phosphorylation or glycosylation site in a transmembrane protein provides topological information. We examine the combination of phosphorylation and glycosylation site prediction with transmembrane protein topology prediction. We report the development of a Hidden Markov Model based method, capable of predicting the topology of transmembrane proteins and the existence of kinase specific phosphorylation and N/O-linked glycosylation sites along the protein sequence. Our method integrates a novel feature in transmembrane protein topology prediction, which results in improved performance for topology prediction and reliable prediction of phosphorylation and glycosylation sites. The method is freely available at http://bioinformatics.biol.uoa.gr/HMMpTM.

14.
Signal peptides and transmembrane helices both contain a stretch of hydrophobic amino acids. This common feature makes it difficult for signal peptide and transmembrane helix predictors to correctly assign identity to stretches of hydrophobic residues near the N-terminal methionine of a protein sequence. The inability to reliably distinguish between N-terminal transmembrane helix and signal peptide is an error with serious consequences for the prediction of protein secretory status or transmembrane topology. In this study, we report a new method for differentiating protein N-terminal signal peptides and transmembrane helices. Based on the sequence features extracted from hydrophobic regions (amino acid frequency, hydrophobicity, and the start position), we set up discriminant functions and examined them on non-redundant datasets with jackknife tests. This method can incorporate other signal peptide prediction methods and achieve higher prediction accuracy. For Gram-negative bacterial proteins, 95.7% of N-terminal signal peptides and transmembrane helices can be correctly predicted (coefficient 0.90). Given a sensitivity of 90%, transmembrane helices can be identified from signal peptides with a precision of 99% (coefficient 0.92). For eukaryotic proteins, 94.2% of N-terminal signal peptides and transmembrane helices can be correctly predicted with coefficient 0.83. Given a sensitivity of 90%, transmembrane helices can be identified from signal peptides with a precision of 87% (coefficient 0.85). The method can be used to complement current transmembrane protein prediction and signal peptide prediction methods to improve their prediction accuracies.

15.
State-of-the-art methods for predicting the topology of α-helical membrane proteins are based on time-consuming multiple sequence alignments obtained from PSI-BLAST or other sources. Here, we examine whether the consensus of topology prediction methods that are based on single sequences can reach an accuracy similar to that of the more accurate multiple sequence-based methods. We show that TOPCONS-single performs better than any of the other topology prediction methods tested here, but ~6% worse than the best method that utilizes multiple sequence alignments. AVAILABILITY AND IMPLEMENTATION: TOPCONS-single is available as a web server from http://single.topcons.net/ and is also included for local installation from the web site. In addition, consensus-based topology predictions for the entire international protein index (IPI) are available from the web server and will be updated at regular intervals.

16.
Membrane proteins, which constitute approximately 20% of most genomes, form two main classes: alpha helical and beta barrel transmembrane proteins. Using methods based on Bayesian Networks, a powerful approach for statistical inference, we have sought to address beta-barrel topology prediction. The beta-barrel topology predictor reports individual strand accuracies of 88.6%. The method outlined here represents a potentially important advance in the computational determination of membrane protein topology.

17.
Prediction of transmembrane spans and secondary structure from the protein sequence is generally the first step in the structural characterization of (membrane) proteins. The preference of a stretch of amino acids in a protein to form secondary structure and its preference to be placed in the membrane are correlated. Nevertheless, current methods predict either secondary structure or individual transmembrane states. We introduce a method that simultaneously predicts the secondary structure and transmembrane spans from the protein sequence. This approach not only eliminates the necessity to create a consensus prediction from possibly contradicting outputs of several predictors but bears the potential to predict conformational switches, i.e., sequence regions that have a high probability to change for example from a coil conformation in solution to an α‐helical transmembrane state. An artificial neural network was trained on databases of 177 membrane proteins and 6048 soluble proteins. The output is a 3 × 3 dimensional probability matrix for each residue in the sequence that combines three secondary structure types (helix, strand, coil) and three environment types (membrane core, interface, solution). The prediction accuracies are 70.3% for nine possible states, 73.2% for three‐state secondary structure prediction, and 94.8% for three‐state transmembrane span prediction. These accuracies are comparable to state‐of‐the‐art predictors of secondary structure (e.g., Psipred) or transmembrane placement (e.g., OCTOPUS). The method is available as web server and for download at www.meilerlab.org. Proteins 2013; 81:1127–1140. © 2013 Wiley Periodicals, Inc.
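
A per-residue 3 × 3 probability matrix can be collapsed into the two three-state predictions mentioned in the abstract by summing over rows or columns. The sketch below illustrates this marginalization; the row/column layout (rows for secondary structure, columns for environment) and the example numbers are assumptions made for illustration, not the output format of the published tool.

```python
SECONDARY = ("helix", "strand", "coil")
ENVIRONMENT = ("membrane core", "interface", "solution")


def marginals(matrix):
    """Collapse one residue's 3 x 3 joint matrix into two 3-state profiles.

    matrix[i][j] is the probability of secondary structure i in environment j.
    """
    by_structure = {SECONDARY[i]: sum(matrix[i]) for i in range(3)}
    by_environment = {
        ENVIRONMENT[j]: sum(matrix[i][j] for i in range(3)) for j in range(3)
    }
    return by_structure, by_environment


def most_likely_state(matrix):
    """Return the single most probable (structure, environment) pair."""
    i, j = max(
        ((i, j) for i in range(3) for j in range(3)),
        key=lambda ij: matrix[ij[0]][ij[1]],
    )
    return SECONDARY[i], ENVIRONMENT[j]
```

For a hypothetical residue with matrix `[[0.50, 0.10, 0.05], [0.00, 0.05, 0.05], [0.05, 0.10, 0.10]]`, the structure marginal is helix 0.65, strand 0.10, coil 0.25, the environment marginal is membrane core 0.55, interface 0.25, solution 0.20, and the most likely joint state is a helix in the membrane core. Comparing such matrices along a sequence is one way to spot the candidate conformational switches the abstract mentions, for example residues with high probability for both coil in solution and helix in the membrane core.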

18.
Co-evolving residues in membrane proteins
MOTIVATION: The analysis of co-evolving residues has been exhaustively evaluated for the prediction of intramolecular amino acid contacts in soluble proteins. Although a variety of different methods for the detection of these co-evolving residues have been developed, the fraction of correctly predicted contacts has remained insufficient for their reliable application in the construction of structural models. Membrane proteins, which constitute between one-fourth and one-third of all proteins in an organism, have been considered in only a few individual case studies. RESULTS: We present the first general study of correlated mutations in alpha-helical membrane proteins. Using seven different prediction algorithms, we extracted co-evolving residues for 14 membrane proteins having a solved 3D structure. On average, distances between correlated pairs of residues lying on different transmembrane segments were found to be significantly smaller compared to a random prediction. Covariation of residues was frequently found in direct sequence neighborhood to helix-helix contacts. Based on the results obtained from individual prediction methods, we constructed a consensus prediction for every protein in the dataset that combines obtained correlations from different prediction algorithms and simultaneously removes likely false positives. Using this consensus prediction, 53% of all predicted residue pairs were found within one helix turn of an observed helix-helix contact. Based on the combination of co-evolving residues detected with the four best prediction algorithms, interacting helices could be predicted with a specificity of 83% and sensitivity of 42%. AVAILABILITY: http://webclu.bio.wzw.tum.de/helixcorr/
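
The abstract does not list the seven algorithms used. A common baseline for detecting co-evolving positions is the mutual information between two columns of a multiple sequence alignment, sketched below. It is offered only as an illustration of the general idea of correlated mutations and is not claimed to be one of the methods evaluated in the study; the toy alignment in the usage note is invented.

```python
import math
from collections import Counter


def column(alignment, index):
    """Extract one column from a list of equally long aligned sequences."""
    return [sequence[index] for sequence in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    High values mean that the residue at one position helps predict the
    residue at the other, which is the signature of co-variation.
    """
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))

    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi
```

For the toy alignment `["AKL", "AKL", "GEL", "GEL"]`, columns 0 and 1 change together and give 1 bit, while the invariant column 2 gives 0 bits with either of them. Raw mutual information is sensitive to phylogenetic bias and to small alignments, which is one reason why studies of this kind compare several measures and combine them into a consensus.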

19.
Fuchs A, Kirschner A, Frishman D. Proteins 2009, 74(4): 857-871
Despite rapidly increasing numbers of available 3D structures, membrane proteins still account for less than 1% of all structures in the Protein Data Bank. Recent high-resolution structures indicate a clearly broader structural diversity of membrane proteins than initially anticipated, motivating the development of reliable structure prediction methods specifically tailored for this class of molecules. One important prediction target capturing all major aspects of a protein's 3D structure is its contact map. Our analysis shows that computational methods trained to predict residue contacts in globular proteins perform poorly when applied to membrane proteins. We have recently published a method to identify interacting alpha-helices in membrane proteins based on the analysis of coevolving residues in predicted transmembrane regions. Here, we present a substantially improved algorithm for the same problem, which uses a newly developed neural network approach to predict helix-helix contacts. In addition to the input features commonly used for contact prediction of soluble proteins, such as windowed residue profiles and residue distance in the sequence, our network also incorporates features that apply to membrane proteins only, such as residue position within the transmembrane segment and its orientation toward the lipophilic environment. The obtained neural network can predict contacts between residues in transmembrane segments with nearly 26% accuracy. It is therefore the first published contact predictor developed specifically for membrane proteins performing with equal accuracy to state-of-the-art contact predictors available for soluble proteins. The predicted helix-helix contacts were employed in a second step to identify interacting helices. For our dataset consisting of 62 membrane proteins of solved structure, we gained an accuracy of 78.1%. Because the reliable prediction of helix interaction patterns is an important step in the classification and prediction of membrane protein folds, our method will be a helpful tool in compiling a structural census of membrane proteins.

20.
While early structural models of helix-bundle integral membrane proteins posited that the transmembrane α-helices [transmembrane helices (TMHs)] were oriented more or less perpendicular to the membrane plane, there is now ample evidence from high-resolution structures that many TMHs have significant tilt angles relative to the membrane. Here, we address the question of whether the tilt is an intrinsic property of the TMH in question or whether it is imparted on the TMH during folding of the protein. Using a glycosylation mapping technique, we show that four highly tilted helices found in multi-spanning membrane proteins all have much shorter membrane-embedded segments when inserted by themselves into the membrane than seen in the high-resolution structures. This suggests that tilting can be induced by tertiary packing interactions within the protein, subsequent to the initial membrane-insertion step.
