Similar articles
1.
Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset were 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. Availability: The datasets, software, and supplementary materials are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm.
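
The per-residue evaluation described in this abstract can be illustrated with a minimal sketch. It assumes each residue already carries a numeric score and that a single threshold separates TMH from non-TMH residues; the scores, the threshold, and the function names below are hypothetical and are not taken from TMHindex, whose Compositional Index formula is not given here.

```python
def classify(scores, threshold):
    """Label a residue as TMH (True) when its score reaches the threshold."""
    return [s >= threshold for s in scores]


def sensitivity_specificity(predicted, actual):
    """Per-residue sensitivity TP/(TP+FN) and specificity TN/(TN+FP)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


# Hypothetical per-residue scores and true TMH labels (illustration only).
scores = [0.9, 0.8, 0.2, 0.1, 0.7, 0.3]
actual = [True, True, False, False, False, True]
predicted = classify(scores, threshold=0.5)
sens, spec = sensitivity_specificity(predicted, actual)
```

In the abstract the threshold is tuned by a genetic algorithm; in this sketch it is simply passed in, so any optimizer could be wrapped around `sensitivity_specificity`.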

2.
The prediction of transmembrane (TM) helix and topology provides important information about the structure and function of a membrane protein. Due to the experimental difficulties in obtaining a high-resolution model, computational methods are highly desirable. In this paper, we present a hierarchical classification method using support vector machines (SVMs) that integrates selected features by capturing the sequence-to-structure relationship and developing a new scoring function based on membrane protein folding. The proposed approach is evaluated on low- and high-resolution data sets with cross-validation, and the topology (sidedness) prediction accuracy reaches as high as 90%. Our method is also found to correctly predict both the location of TM helices and the topology for 69% of the low-resolution benchmark set. We also test our method for discrimination between soluble and membrane proteins and achieve very low overall false positive (0.5%) and false negative rates (0 to approximately 1.2%). Lastly, the analysis of the scoring function suggests that the topogeneses of single-spanning and multispanning TM proteins have different levels of complexity, and the consideration of interloop topogenic interactions for the latter is the key to achieving better predictions. This method can facilitate the annotation of membrane proteomes to extract useful structural and functional information. It is publicly available at http://bio-cluster.iis.sinica.edu.tw/~bioapp/SVMtop.

3.
Modeling of integral membrane proteins and the prediction of their functional sites requires the identification of transmembrane (TM) segments and the determination of their angular orientations. Hydrophobicity scales accurately predict the location of TM helices, but are less accurate in computing angular disposition. Estimating lipid-exposure propensities of the residues from statistics of solved membrane protein structures has the disadvantage of relying on relatively few proteins. As an alternative, we propose here a scale of knowledge-based Propensities for Residue Orientation in Transmembrane segments (kPROT), derived from the analysis of more than 5000 non-redundant protein sequences. We assume that residues that tend to be exposed to the membrane are more frequent in TM segments of single-span proteins, while residues that prefer to be buried in the transmembrane bundle interior are present mainly in multi-span TMs. The kPROT value for each residue is thus defined as the logarithm of the ratio of its proportions in single and multiple TM spans. The scale is refined further by defining it for three discrete sections of the TM segment; namely, extracellular, central, and intracellular. The capacity of the kPROT scale to predict angular helical orientation was compared to that of alternative methods in a benchmark test, using a diversity of multi-span alpha-helical transmembrane proteins with a solved 3D structure. kPROT yielded an average angular error of 41 degrees, significantly lower than that of alternative scales (62 to 68 degrees). The new scale thus provides a useful general tool for modeling and prediction of functional residues in membrane proteins. A WWW server (http://bioinfo.weizmann.ac.il/kPROT) is available for automatic helix orientation prediction with kPROT.
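
The log-ratio definition in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the ratio is taken as single-span over multi-span, the logarithm is natural, and a pseudocount is added to avoid division by zero. None of these choices, nor the function names, are confirmed by the abstract, and the three-section refinement is not reproduced.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def proportions(segments, pseudocount=1.0):
    """Proportion of each amino acid over a list of TM segment strings."""
    counts = Counter("".join(segments))
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def log_ratio_scale(single_span, multi_span, pseudocount=1.0):
    """Log of (proportion in single-span TMs / proportion in multi-span TMs).

    Assumption: positive values suggest a preference for lipid exposure,
    negative values a preference for burial in the helix bundle.
    """
    p_single = proportions(single_span, pseudocount)
    p_multi = proportions(multi_span, pseudocount)
    return {a: math.log(p_single[a] / p_multi[a]) for a in AMINO_ACIDS}


# Toy segments (hypothetical, far too small for real statistics).
scale = log_ratio_scale(["LLLLAV"], ["GGSSAV"])
```

A real scale would be estimated from thousands of non-redundant sequences, as the abstract describes; the toy input only shows the sign behaviour of the ratio.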

4.
TMPDB is a database of experimentally characterized transmembrane (TM) topologies. TMPDB release 6.2 contains a total of 302 TM protein sequences, of which 276 are alpha-helical sequences, 17 are beta-stranded, and 9 are alpha-helical sequences with short pore-forming helices buried in the membrane. The TM topologies in TMPDB were determined experimentally by means of X-ray crystallography, NMR, gene fusion techniques, the substituted cysteine accessibility method, N-linked glycosylation experiments and other biochemical methods. TMPDB would be useful as a test and/or training dataset for improving existing TM topology prediction methods or developing novel methods with higher performance, and as a guide for both bioinformaticians and biologists to better understand TM proteins. TMPDB and its subsets are freely available at the following web site: http://bioinfo.si.hirosaki-u.ac.jp/~TMPDB/.

5.
A key question associated with topology predictions for membrane proteins is whether there is sufficient variation in the biophysical properties of residues at the membrane interface to enable identification of TM spans in a robust and efficient manner using relatively simple methods of analysis. Here, a test for the homogeneity of multinomial populations is used to identify statistical differences between the residue compositions of windows within datasets of aligned non-homologous TM α-helices. Using this approach, the accuracy and robustness of the predicted boundaries for datasets of uncleaved signal (US) sequences and stop transfer (ST) sequences are tested. The validity of the 21-residue length, which is generally assumed for TM spans in membrane protein topology prediction, is also investigated, and it is suggested that ST sequences may be better represented by a length of 22 residues.

6.
Helices in membrane-spanning regions are more tightly packed than the helices in soluble proteins. Thus, we introduce a method that uses a simple scale of burial propensity and a new algorithm to predict transmembrane helical (TMH) segments and a positive-inside rule to predict amino-terminal orientation. The method (the topology predictor of transmembrane helical proteins using mean burial propensity [THUMBUP]) correctly predicted the topology of 55 of 73 proteins (or 75%) with known three-dimensional structures (the 3D helix database). This level of accuracy can be reached by MEMSAT 1.8 (a 200-parameter model-recognition method) and a new HMM-based method (a 111-parameter hidden Markov model, UMDHMM(TMHP)) if they are retrained with the 73-protein database. Thus, a method based on a physicochemical property can provide topology prediction as accurate as those methods based on more complicated statistical models and learning algorithms for the proteins with accurately known structures. Commonly used HMM-based methods and MEMSAT 1.8 were trained with a combination of the partial 3D helix database and a 1D helix database of TMH proteins in which topology information was obtained by gene fusion and other experimental techniques. These methods provide a significantly poorer prediction for the topology of TMH proteins in the 3D helix database. This suggests that the 1D helix database, because of its inaccuracy, should be avoided as either a training or testing database. A Web server of THUMBUP and UMDHMM(TMHP) is established for academic users at http://www.smbs.buffalo.edu/phys_bio/service.htm. The 3D helix database is also available from the same Web site.

7.
Predicting the transmembrane regions is an important aspect of understanding the structures and architecture of different β-barrel membrane proteins. Despite significant efforts, currently available β-transmembrane region predictors are still limited in terms of prediction accuracy, especially in precision. Here, we describe PredβTM, a transmembrane region prediction algorithm for β-barrel proteins. Using amino acid pair frequency information in known β-transmembrane protein sequences, we have trained a support vector machine classifier to predict β-transmembrane segments. Position-specific amino acid preference data is incorporated in the final prediction. The predictor does not incorporate evolutionary profile information explicitly, but is based on sequence patterns generated implicitly by encoding the protein segments using an amino acid adjacency matrix. With a benchmark set of 35 β-transmembrane proteins, PredβTM shows a sensitivity and precision of 83.71% and 72.98%, respectively. The segment overlap score is 82.19%. In comparison with other state-of-the-art methods, PredβTM provides a higher precision and segment overlap without compromising sensitivity. Further, we applied PredβTM to analyze the β-barrel membrane proteins without defined transmembrane regions and the uncharacterized protein sequences in eight bacterial genomes and predict possible β-transmembrane proteins. PredβTM can be freely accessed on the web at http://transpred.ki.si/.

8.
Haipeng Gong. Proteins, 2017, 85(12): 2162-2169
Helix-helix interactions are crucial in the structure assembly, stability and function of helix-rich proteins including many membrane proteins. In spite of remarkable progress over the past decades, the accuracy of predicting protein structures from their amino acid sequences is still far from satisfactory. In this work, we focused on a simpler problem, the prediction of helix-helix interactions, the results of which could facilitate practical protein structure prediction by constraining the sampling space. Specifically, we started from the noisy 2D residue contact maps derived from correlated residue mutations, and utilized ridge detection to identify the characteristic residue contact patterns for helix-helix interactions. The ridge information as well as a few additional features were then fed into a machine learning model, HHConPred, to predict interactions between helix pairs. In an independent test, our method achieved an F-measure of ~60% for predicting helix-helix interactions. Moreover, although the model was trained mainly using soluble proteins, it could be extended to membrane proteins with at least comparable performance relative to previous approaches that were generated purely using membrane proteins. All data and source code are available at http://166.111.152.91/Downloads.html or https://github.com/dpxiong/HHConPred.

9.
Prediction of transmembrane (TM) segments of amino acid sequences of membrane proteins is a well-known and very important problem. The accuracy of its solution can be improved for approaches that do not use a homology search in an additional data bank. There is a lack of tested data in this area of research, because information on the structure of membrane proteins is scarce. In this work we created a test sample of structural alignments for membrane proteins. The TM segments of these proteins were mapped according to aligned 3D structures resolved for these proteins. A method for predicting TM segments in an alignment was developed on the basis of the forward-backward algorithm from the HMM theory. This method allows a user not only to predict TM segments, but also to create a probabilistic membrane profile, which can be employed in multiple alignment procedures taking the secondary structure of proteins into account. The method was implemented in a computer program available at http://bioinf.fbb.msu.ru/fwdbck/. It provides better results than the MEMSAT method, which is nearly the only tool predicting TM segments in multiple alignments, without a homology search.
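
The forward-backward algorithm mentioned in this abstract can be sketched for a toy two-state HMM. Everything concrete below is an assumption made for illustration: the two states ("M" membrane, "O" other), the reduction of residues to hydrophobic ("h") and polar ("p") symbols, and all probabilities. The sketch works on a single sequence rather than an alignment, so it is not the authors' program; it only shows how per-position posteriors give a "probabilistic membrane profile".

```python
def forward_backward(obs, states, start, trans, emit):
    """Return per-position posterior state probabilities (scaled recursions)."""
    n = len(obs)
    fwd, scales = [], []
    for t, symbol in enumerate(obs):
        row = {}
        for s in states:
            if t == 0:
                prior = start[s]
            else:
                prior = sum(fwd[t - 1][r] * trans[r][s] for r in states)
            row[s] = prior * emit[s][symbol]
        c = sum(row.values())  # scaling factor keeps numbers from underflowing
        scales.append(c)
        fwd.append({s: row[s] / c for s in states})

    bwd = [None] * n
    bwd[n - 1] = {s: 1.0 for s in states}
    for t in range(n - 2, -1, -1):
        bwd[t] = {
            s: sum(trans[s][r] * emit[r][obs[t + 1]] * bwd[t + 1][r]
                   for r in states) / scales[t + 1]
            for s in states
        }

    posterior = []
    for t in range(n):
        row = {s: fwd[t][s] * bwd[t][s] for s in states}
        z = sum(row.values())
        posterior.append({s: row[s] / z for s in states})
    return posterior


# Hypothetical model parameters (not fitted to any data).
states = ("M", "O")
start = {"M": 0.5, "O": 0.5}
trans = {"M": {"M": 0.9, "O": 0.1}, "O": {"M": 0.1, "O": 0.9}}
emit = {"M": {"h": 0.9, "p": 0.1}, "O": {"h": 0.2, "p": 0.8}}
profile = forward_backward("ppphhhhhppp", states, start, trans, emit)
```

Each entry of `profile` is a probability distribution over the two states at that position; the `"M"` column plays the role of the membrane profile.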

10.
TMpro is a transmembrane (TM) helix prediction algorithm that uses language processing methodology for TM segment identification. It is primarily based on the analysis of statistical distributions of properties of amino acids in transmembrane segments. This article describes the availability of TMpro on the internet via a web interface. The key features of the interface are: (i) output is generated in multiple formats, including a user-interactive graphical chart which allows comparison of TMpro predicted segment locations with other labeled segments input by the user, such as predictions from other methods; (ii) up to 5000 sequences can be submitted at a time for prediction; (iii) TMpro is available as a web server and is published as a web service so that the method can be accessed by users as well as other services depending on the need for data integration. Availability: http://linzer.blm.cs.cmu.edu/tmpro/ (web server and help), http://blm.sis.pitt.edu:8080/axis/services/TMProFetcherService (web service).

11.
MOTIVATION: Membrane domain prediction has recently been re-evaluated by several groups, suggesting that the accuracy of existing methods is still rather limited. In this work, we revisit this problem and propose novel methods for prediction of alpha-helical as well as beta-sheet transmembrane (TM) domains. The new approach is based on a compact representation of an amino acid residue and its environment, which consists of predicted solvent accessibility and secondary structure of each amino acid. A recently introduced method for solvent accessibility prediction trained on a set of soluble proteins is used here to indicate segments of residues that are predicted not to be accessible to water and, therefore, may be 'buried' in the membrane. While evolutionary profiles in the form of a multiple alignment are used to derive these simple 'structural profiles', they are not used explicitly for the membrane domain prediction and the overall number of parameters in the model is significantly reduced. This offers the possibility of a more reliable estimation of the free parameters in the model with a limited number of experimentally resolved membrane protein structures. RESULTS: Using cross-validated training on available sets of structurally resolved and non-redundant alpha and beta membrane proteins, we demonstrate that membrane domain prediction methods based on such a compact representation outperform approaches that explicitly utilize evolutionary profiles and multiple alignments. Moreover, using an external evaluation by the TMH Benchmark server we show that our final prediction protocol for the TM helix prediction is competitive with the state-of-the-art methods, achieving per-residue accuracy of approximately 89% and per-segment accuracy of approximately 80% on the set of high resolution structures used by the TMH Benchmark server. At the same time the observed rates of confusion with signal peptides and globular proteins are the lowest among the tested methods. The new method is available online at http://minnou.cchmc.org.

12.
The transmembrane topology of the Acr3 family arsenite transporter Acr3 from Bacillus subtilis was analysed experimentally using translational fusions with alkaline phosphatase and green fluorescent protein and in silico by topology modelling. Initial topology prediction resulted in two models with 9 and 10 TM helices, respectively. Thirty-two fusion constructs were made between truncated forms of acr3 and the reporter genes at 17 different sites throughout the acr3 sequence to discriminate between these models. Nine strong reporter protein signals provided information about the majority of the locations of the cytoplasmic and extracellular loops of Acr3 and showed that both the N- and the C-termini are located in the cytoplasm. Two ambiguous data points indicated the possibility of an alternative 8-helix topology. This possibility was investigated using another 10 fusion variants, but no experimental support for the 8 TM topology was obtained. We therefore conclude that Acr3 has 10 transmembrane helices. Overall, the loops which connect the membrane-spanning segments are short, with cytoplasmic loops being somewhat longer than the extracellular loops. The study provides the first experimentally derived structural information on a protein of the Acr3 family, which constitutes one of the largest classes of arsenite transporters.

13.
Membrane proteins are a major class of proteins and are encoded by approximately 20% to 30% of genes in most organisms. In this work, a novel two-layer membrane protein prediction system, called Mem-PHybrid, is proposed. In the first layer, it identifies the query protein as a membrane or nonmembrane protein. In the second layer, it further identifies the type of membrane protein. The proposed Mem-PHybrid prediction system is based on hybrid features, whereby a fusion of both the physicochemical and split amino acid composition-based features is performed. This enables the proposed Mem-PHybrid to exploit the discrimination capabilities of both types of feature extraction strategy. In addition, minimum redundancy and maximum relevance has also been applied to reduce the dimensionality of a feature vector. We employ random forest, evidence-theoretic K-nearest neighbor, and support vector machine (SVM) as classifiers and analyze their performance on two datasets. SVM using hybrid features yields the highest accuracy of 89.6% and 97.3% on dataset1 and 91.5% and 95.5% on dataset2 for jackknife and independent dataset tests, respectively. The enhanced prediction performance of Mem-PHybrid is largely attributed to the exploitation of the discrimination power of the hybrid features and of the learning capability of SVM. Mem-PHybrid is accessible at http://www.111.68.99.218/Mem-PHybrid.

14.
We show that the peptide backbone of an alpha-helix places a severe thermodynamic constraint on transmembrane (TM) stability. Neglect of this constraint by commonly used hydrophobicity scales underlies the notorious uncertainty of TM helix prediction by sliding-window hydropathy plots of membrane protein (MP) amino acid sequences. We find that an experiment-based whole-residue hydropathy scale (WW scale), which includes the backbone constraint, identifies TM helices of membrane proteins with an accuracy greater than 99%. Furthermore, it correctly predicts the minimum hydrophobicity required for stable single-helix TM insertion observed in Escherichia coli. In order to improve membrane protein topology prediction further, we introduce the augmented WW (aWW) scale, which accounts for the energetics of salt-bridge formation. An important issue for genomic analysis is the ability of the hydropathy plot method to distinguish membrane from soluble proteins. We find that the method falsely predicts 17 to 43% of a set of soluble proteins to be MPs, depending upon the hydropathy scale used.
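
A sliding-window hydropathy plot of the kind discussed in this abstract can be sketched as follows. The scale is passed in as a parameter; the example uses the Kyte-Doolittle values purely as a stand-in, since the WW and aWW scale values are not given in the abstract. The window length of 19 and the threshold of 1.6 are conventional choices for Kyte-Doolittle plots and are assumptions here, as are the function names.

```python
# Kyte-Doolittle hydropathy values, used here only as a stand-in scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_means(sequence, scale, window=19):
    """Mean hydropathy of every full window along the sequence."""
    if len(sequence) < window:
        return []
    values = [scale[a] for a in sequence]
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]  # slide the window by one
        means.append(total / window)
    return means


def candidate_segments(means, window, threshold):
    """Merge windows at or above the threshold into (start, end) ranges.

    Indices are 0-based residue positions; end is exclusive.
    """
    segments = []
    for i, mean in enumerate(means):
        if mean >= threshold:
            start, end = i, i + window
            if segments and start <= segments[-1][1]:
                segments[-1] = (segments[-1][0], end)
            else:
                segments.append((start, end))
    return segments


# Toy sequence: a 20-residue leucine stretch flanked by charged residues.
toy = "D" * 10 + "L" * 20 + "K" * 10
means = window_means(toy, KYTE_DOOLITTLE, window=19)
segments = candidate_segments(means, window=19, threshold=1.6)
```

Swapping in a different scale only requires passing another dictionary, which is the comparison the abstract makes when it reports that the false-positive rate on soluble proteins depends on the scale used.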

15.
A technique for prediction of protein membrane topology (intra- and extracellular sidedness) has been developed. Membrane-spanning segments are first predicted using an algorithm based upon multiply aligned amino acid sequences. The compositional differences in the protein segments exposed at each side of the membrane are then investigated. The ratios are calculated for Asn, Asp, Gly, Phe, Pro, Trp, Tyr, and Val, mostly found on the extracellular side, and for Ala, Arg, Cys, and Lys, mostly occurring on the intracellular side. The consensus over these 12 residue distributions is used for sidedness prediction. The method was developed with a set of 42 protein families, of which all but one were correctly predicted with the new algorithm. This represents an improvement over previous techniques. The new method, applied to a set of 12 membrane protein families different from the test set and with recently determined topologies, performed well, with 11 of 12 sidedness assignments agreeing with experimental results. The method has also been applied to several membrane protein families for which the topology has yet to be determined. An electronic prediction service is available at the E-mail address tmap@embl-heidelberg.de and on WWW via http://www.emblheidelberg.de.
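
The consensus over the 12 residue distributions can be illustrated with a simple voting sketch. The two residue groups come from the abstract; the voting rule (one vote per residue, majority wins, ties left undecided) and the function names are assumptions, since the abstract does not say how the consensus is formed.

```python
EXTRACELLULAR_RESIDUES = "NDGFPWYV"  # Asn, Asp, Gly, Phe, Pro, Trp, Tyr, Val
INTRACELLULAR_RESIDUES = "ARCK"      # Ala, Arg, Cys, Lys


def fraction(sequence, residue):
    """Fraction of positions in the sequence occupied by the residue."""
    return sequence.count(residue) / len(sequence) if sequence else 0.0


def predict_extracellular_side(side_a, side_b):
    """Return 'A' or 'B' for the side voted extracellular, or None on a tie.

    side_a and side_b are the concatenated loop sequences found on the
    two sides of the membrane.
    """
    votes_a = votes_b = 0
    for residue in EXTRACELLULAR_RESIDUES:
        fa, fb = fraction(side_a, residue), fraction(side_b, residue)
        if fa > fb:
            votes_a += 1
        elif fb > fa:
            votes_b += 1
    for residue in INTRACELLULAR_RESIDUES:
        fa, fb = fraction(side_a, residue), fraction(side_b, residue)
        if fa > fb:
            votes_b += 1  # side A looks intracellular, so B gets the vote
        elif fb > fa:
            votes_a += 1
    if votes_a > votes_b:
        return "A"
    if votes_b > votes_a:
        return "B"
    return None


# Toy loop sequences (hypothetical).
outside_like = "NDGFPWYVNDG"
inside_like = "ARCKARCKKR"
call = predict_extracellular_side(outside_like, inside_like)
```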

16.
Membrane topology refers to the two-dimensional structural information of a membrane protein that indicates the number of transmembrane (TM) segments and the orientation of soluble domains relative to the plane of the membrane. Since membrane proteins are co-translationally translocated across and inserted into the membrane, the TM segments orient themselves properly in an early stage of membrane protein biogenesis. Each membrane protein must contain some topogenic signals, but the translocation components and the membrane environment also influence the membrane topology of proteins. We discuss the factors that affect membrane protein orientation and list available experimental tools that can be used in determining membrane protein topology.

18.
MOTIVATION: Prediction methods are of great importance for membrane proteins as experimental information is harder to obtain than for globular proteins. As more membrane protein structures are solved it is clear that topology information only provides a simplified picture of a membrane protein. Here, we describe a novel challenge for the prediction of alpha-helical membrane proteins: to predict the distance between a residue and the center of the membrane, a measure we define as the Z-coordinate. Even though the traditional way of depicting membrane protein topology is useful, it is advantageous to have a measure that is based on a more "physical" property such as the Z-coordinate, since it implicitly contains information about re-entrant helices, interfacial helices, the tilt of a transmembrane helix and loop lengths. RESULTS: We show that the Z-coordinate can be predicted using either artificial neural networks, hidden Markov models or combinations of both. The best method, ZPRED, uses the output from a hidden Markov model together with a neural network. The average error of ZPRED is 2.55 Å, and 68.6% of the residues are predicted within 3 Å of the target Z-coordinate in the 5-25 Å region. ZPRED is also able to predict the maximum protrusion of a loop to within 3 Å for 78% of the loops in the dataset. AVAILABILITY: Supplementary information and training data are available at http://www.sbc.su.se/~erikgr/.

19.
The YidC/Oxa1/Alb3 family of proteins catalyzes membrane protein insertion in bacteria, mitochondria, and chloroplasts. In this study, we investigated which regions of the bacterial YidC protein are important for its function in membrane protein biogenesis. In Escherichia coli, YidC spans the membrane six times, with a large 319-residue periplasmic domain following the first transmembrane domain. We found that this large periplasmic domain is not required for YidC function and that the residues in the exposed hydrophilic loops or C-terminal tail are not critical for YidC activity. Rather, the five C-terminal transmembrane segments that contain the three consensus sequences in the YidC/Oxa1/Alb3 family are important for its function. However, by systematically replacing all the residues in transmembrane segment (TM) 2, TM3, and TM6 with serine and by swapping TM4 and TM5 with unrelated transmembrane segments, we show that the precise sequence of these transmembrane regions is not essential for in vivo YidC activity. Single serine mutations in TM2, TM3, and TM6 impaired the membrane insertion of the Sec-independent procoat-leader peptidase protein. We propose that the five C-terminal transmembrane segments of YidC function as a platform for the translocating substrate protein to support its insertion into the membrane.

20.
The TOPDOM database is a collection of domains and sequence motifs located consistently on the same side of the membrane in alpha-helical transmembrane proteins. The database was created by scanning well-annotated transmembrane protein sequences in the UniProt database with specific domain- or motif-detecting algorithms. The identified domains or motifs were added to the database if they were uniformly annotated on the same side of the membrane in the various proteins in the UniProt database. The information about the location of the collected domains and motifs can be incorporated into constrained topology prediction algorithms, such as HMMTOP, increasing the prediction accuracy. AVAILABILITY: The TOPDOM database and the constrained HMMTOP prediction server are available on the page http://topdom.enzim.hu. CONTACT: tusi@enzim.hu; lkalmar@enzim.hu.
