Similar literature
1.
We investigated the evolution of transmembrane (TM) topology by detecting partial sequence repeats in TM protein sequences and analyzing them in detail. A total of 377 sequences that seem to have evolved by internal gene duplication events were found among 38,124 predicted TM protein sequences (excluding single-spanning proteins) from 87 prokaryotic genomes. Various types of internal duplication patterns were identified in these sequences. The majority are diploid-type (including quasi-diploid-type) duplications, in which a primordial protein sequence was duplicated internally to become an extant TM protein with twice as many TM segments as the primordial one; the remainder are partial duplications, including triploid-type. The diploid-type repeats are recognized in many 8-tms, 10-tms and 12-tms TM protein sequences, suggesting that diploid-type duplication was a principal mechanism in the evolutionary development of these types of TM proteins. The "positive-inside" rule is satisfied in whole sequences of both 10-tms and 8-tms TM proteins and in both halves of 10-tms proteins, but not necessarily in the second half of 8-tms proteins, providing fitting examples of "internal divergent topology evolution" that likely occurred after a diploid-type internal duplication event. From analyzing the partial duplication patterns, several evolutionary pathways were recognized for 6-tms TM proteins, i.e. from primordial 2-tms, 3-tms and 4-tms TM proteins to extant 6-tms proteins. Similarly, the duplication pattern analysis revealed plausible evolutionary scenarios in which 7-tms TM proteins arose from 3-tms, 4-tms and 5-tms TM protein precursors via partial internal gene duplications.
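The "positive-inside" check mentioned above can be illustrated with a minimal Python sketch. It assumes the common reading of the rule (loops on the cytoplasmic side carry more Lys/Arg than loops on the other side); the function names, the segment representation, and the simple majority criterion are illustrative assumptions, not the authors' procedure.

```python
def positive_charge_by_side(sequence, tm_segments, n_term_inside=True):
    """Count Lys/Arg residues in the loops on each side of the membrane.

    tm_segments: sorted list of (start, end) spans, 0-based, end-exclusive.
    Loops alternate sides, starting on the side of the N terminus.
    """
    counts = {"inside": 0, "outside": 0}
    side_inside = n_term_inside
    loop_start = 0
    # A zero-length sentinel segment at the end closes the C-terminal loop.
    for start, end in list(tm_segments) + [(len(sequence), len(sequence))]:
        loop = sequence[loop_start:start]
        key = "inside" if side_inside else "outside"
        counts[key] += loop.count("K") + loop.count("R")
        side_inside = not side_inside  # each TM segment crosses the membrane
        loop_start = end
    return counts


def satisfies_positive_inside(sequence, tm_segments, n_term_inside=True):
    """Hypothetical criterion: more K/R inside than outside."""
    counts = positive_charge_by_side(sequence, tm_segments, n_term_inside)
    return counts["inside"] > counts["outside"]
```

To examine the two halves of a duplicated protein separately, as the abstract describes, the same function could be called on each half with its own segment list and N-terminal orientation.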

2.
We propose a novel method for identifying and classifying the functions of transmembrane (TM) proteins based on their TM topology [the number of TM segments (tms), the loop length and the N-terminus location]. In this method, the TM topology is expressed as a string of '0' and '1', and this is designated the binary topology pattern (BTP). We focused on TM proteins with up to 12 tms, with the exception of 1 and 9 tms, and classified them into 37 functional groups by the number of tms and the functional annotation. These grouped TM protein sequences were used to determine BTPs which are specific to the individual functional groups. Since the evaluated accuracies (sensitivity, specificity and self-consistency) of these patterns in functional identification were quite high overall, i.e. 0.940, 0.934 and 0.935, respectively, as averaged over the 37 functional groups, we confirmed that TM protein function can be identified by the number of tms and the characteristics of loop lengths, i.e. BTPs.
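The abstract does not specify how a topology is turned into a string of '0' and '1', so the following Python sketch shows only one hypothetical encoding: one bit for the N-terminus side, then one bit per loop marking it as long or short. The function name, the bit order, and the 30-residue cutoff are all assumptions made for illustration.

```python
def binary_topology_pattern(loop_lengths, n_term_inside, long_loop_cutoff=30):
    """Encode a TM topology as a bit string (hypothetical scheme).

    First bit: 1 if the N terminus is inside, else 0.
    Remaining bits: 1 for each loop at least long_loop_cutoff residues long.
    """
    bits = ["1" if n_term_inside else "0"]
    bits += ["1" if length >= long_loop_cutoff else "0" for length in loop_lengths]
    return "".join(bits)
```

With such an encoding, proteins with the same number of tms yield strings of equal length, which makes it straightforward to compare them position by position within a functional group.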

3.
Hydrophobicity analyses applied to databases of soluble and transmembrane (TM) proteins of known structure were used to resolve total genomic hydrophobicity profiles into (helical) TM sequences and mainly "subhydrophobic" soluble components. This information was used to define a refined "hydrophobicity"-type TM sequence prediction scale that should approach the theoretical limit of accuracy. The refinement procedure involved adjusting scale values to eliminate differences between the average amino acid composition of populations of TM and soluble sequences of equal hydrophobicity, a required property of a scale having maximum accuracy. Application of this procedure to different hydrophobicity scales caused them to collapse to essentially a single TM tendency scale. As expected, when different scales were compared, the TM tendency scale was the most accurate at predicting TM sequences. It was especially highly correlated (r = 0.95) with the biological hydrophobicity scale, derived experimentally from the percent TM conformation formed by artificial sequences passing through the translocon. It was also found that resolution of total genomic sequence data into TM and soluble components could be used to define the percent probability that a sequence with a specific hydrophobicity value forms a TM segment. Application of the TM tendency scale to whole genomic data revealed an overlap of TM and soluble sequences in the "semihydrophobic" range. This raises the possibility that a significant number of proteins have sequences that can switch between TM and non-TM states. Such proteins may exist in moonlighting forms having properties very different from those of the predominant conformation.
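As a rough illustration of how a hydrophobicity-type scale is applied to a sequence, the Python sketch below averages per-residue scores over a sliding window and flags windows above a threshold. The TM tendency scale values are not given in the abstract, so the Kyte-Doolittle hydropathy scale is swapped in as a stand-in; the window of 19 residues and the threshold of 1.6 are the values commonly used with that scale, and the function names are illustrative.

```python
# Kyte-Doolittle hydropathy values, used here as a stand-in scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence, window=19, scale=KYTE_DOOLITTLE):
    """Mean scale value of every full window along the sequence."""
    scores = [scale[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(sequence, window=19, threshold=1.6):
    """Start indices of windows whose mean hydropathy reaches the threshold."""
    profile = window_hydropathy(sequence, window)
    return [i for i, h in enumerate(profile) if h >= threshold]
```

Any other scale, including a refined one of the kind the abstract describes, could be passed through the `scale` parameter of `window_hydropathy` without changing the rest of the sketch.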

4.
A combined transmembrane topology and signal peptide prediction method (cited 31 times: 0 self-citations, 31 citations by others)
An inherent problem in transmembrane protein topology prediction and signal peptide prediction is the high similarity between the hydrophobic regions of a transmembrane helix and those of a signal peptide, leading to cross-reaction between the two types of predictions. To improve predictions further, it is therefore important to make a predictor that aims to discriminate between the two classes. In addition, topology information can be gained when successfully predicting a signal peptide leading a transmembrane protein, since it dictates that the N terminus of the mature protein must be on the non-cytoplasmic side of the membrane. Here, we present Phobius, a combined transmembrane protein topology and signal peptide predictor. The predictor is based on a hidden Markov model (HMM) that models the different sequence regions of a signal peptide and the different regions of a transmembrane protein in a series of interconnected states. Training was done on a newly assembled and curated dataset. Compared to TMHMM and SignalP, errors coming from cross-prediction between transmembrane segments and signal peptides were reduced substantially by Phobius. False classifications of signal peptides were reduced from 26.1% to 3.9% and false classifications of transmembrane helices were reduced from 19.0% to 7.7%. Phobius was applied to the proteomes of Homo sapiens and Escherichia coli. Here we also noted a drastic reduction of false classifications compared to TMHMM/SignalP, suggesting that Phobius is well suited for whole-genome annotation of signal peptides and transmembrane regions. The method is available at as well as at

5.
Schaadt NS, Helms V. Biopolymers, 2012, 97(7): 558-567
Membrane transporters catalyze the transport of small solute molecules across biological barriers such as lipid bilayer membranes. As the experimental annotation of which proteins transport which substrates is incomplete, it is highly desirable to develop computational methods that can assist in the classification and substrate annotation of putative membrane transport proteins. Here, we determined the similarity of membrane transporter sequences annotated in the Transport Classification Database (Saier et al., Nucleic Acids Res 2006, 34, D181-D186) and Arabidopsis thaliana membrane transporters annotated in the database Aramemnon (Schwacke et al., Plant Physiol 2003, 131, 16-26). The similarity measure was based on the amino acid composition, considering either the full sequences or the transmembrane (TM) and external parts of the sequences separately. We considered four different substrate sets and three different subfamilies and tried to classify the given proteins into these classes. Family or substrate prediction based on the simple amino acid frequency had an average accuracy of 76%. The differentiation between TM and non-TM regions led to an improved accuracy of 80% on average.
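Composition-based classification of the kind described above can be sketched in a few lines of Python. The abstract does not state which similarity measure or classifier was used, so this sketch assumes Euclidean distance between 20-dimensional amino acid frequency vectors and a nearest-centroid rule; all names are illustrative.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Relative frequency of each standard amino acid in the sequence."""
    counts = Counter(sequence)
    total = sum(counts[aa] for aa in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[aa] / total for aa in AMINO_ACIDS]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(query, labelled):
    """Assign the query to the class with the nearest mean composition.

    labelled: dict mapping class name to a list of example sequences.
    """
    q = composition(query)
    best_label, best_distance = None, float("inf")
    for label, sequences in labelled.items():
        vectors = [composition(s) for s in sequences]
        centroid = [sum(column) / len(vectors) for column in zip(*vectors)]
        distance = euclidean(q, centroid)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label
```

To mimic the separation of TM and non-TM regions, one option would be to compute `composition` for each region and concatenate the two vectors before measuring distance.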

6.
SUMMARY: In eukaryotes, membranous proteins account for 20-30% of the proteome. Most of these proteins contain one or more transmembrane (TM) domains. These are short segments that traverse the lipid bilayer membrane. Various properties of the TM domains, such as their number, their topology and their arrangement within the membrane, are closely related to the protein's cellular functions. The properties of the TM domains also determine the cellular targeting and localization of these proteins. It is not known, however, whether the information encoded by TM domains suffices for the purpose of classifying proteins into their functional families. This is the question we address here. We introduce an algorithm that creates a profile of each functional family of membranous proteins based only on the amino acid composition of their TM domains. This is complemented by a classifier program for each such family (to determine whether a given protein belongs to it or not). We find that in most instances TM domains contain enough information to allow accurate discrimination (approximately 80% sensitivity and approximately 90% specificity) among unrelated polytopic functional families with the same number of TM domains. SUPPLEMENTARY INFORMATION: Available at www.protonet.cs.huji.ac.il/TM/

7.
How recycling receptors are segregated from down-regulated receptors in the endosome is unknown. In previous studies, we demonstrated that substitutions in the transferrin receptor (TR) transmembrane domain (TM) convert the protein from an efficiently recycling receptor to one that is rapidly down-regulated. In this study, we demonstrate that the "signal" within the TM necessary and sufficient for down-regulation is Thr(11)Gln(17)Thr(19) (numbering in TM). Transplantation of these polar residues into the wild-type TR promotes receptor down-regulation that can be demonstrated by changes in protein half-life and in receptor recycling. Surprisingly, this modification dramatically increases the TR internalization rate as well (an approximately 79% increase). Sucrose gradient centrifugation and cross-linking studies reveal that the propensity of the receptors to self-associate correlates with down-regulation. Interestingly, a number of cell surface proteins that contain TM polar residues are known to be efficiently down-regulated, whereas recycling receptors for low-density lipoprotein and transferrin conspicuously lack these residues. Our data, therefore, suggest a simple model in which specific residues within the TM sequences dramatically influence the fate of membrane proteins after endocytosis, providing an alternative signal for down-regulation of receptor complexes to the well-characterized cytoplasmic tail targeting signals.

8.
A key question associated with topology predictions for membrane proteins is whether there is sufficient variation in the biophysical properties of residues at the membrane interface to enable identification of TM spans in a robust and efficient manner using relatively simple methods of analysis. Here, a test for the homogeneity of multinomial populations is used to identify statistical differences between the residue compositions of windows within datasets of aligned non-homologous TM α-helices. Using this approach, the accuracy and robustness of the predicted boundaries for datasets of uncleaved signal (US) sequences and stop transfer (ST) sequences are tested. The validity of the 21-residue length, which is generally assumed for TM spans in membrane protein topology prediction, is also investigated, and it is suggested that ST sequences may be better represented by a length of 22 residues.

9.
Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset were 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. Availability: The datasets and software, together with supplementary materials, are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm.
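The exact definition of the Compositional Index is not given in the abstract. As a loose illustration of the general idea (scoring each amino acid by how differently it occurs in TMH and non-TMH training segments, then thresholding), the Python sketch below uses a log-odds score with pseudocounts and a fixed threshold. The formula, the smoothing window, and the threshold are assumptions; the paper tunes its threshold with a genetic algorithm, which is not reproduced here.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def compositional_index(tmh_segments, non_tmh_segments, pseudocount=1.0):
    """Log-odds of each amino acid occurring in TMH vs non-TMH segments."""
    tmh = Counter("".join(tmh_segments))
    non = Counter("".join(non_tmh_segments))
    tmh_total = sum(tmh[aa] for aa in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    non_total = sum(non[aa] for aa in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {
        aa: math.log(
            ((tmh[aa] + pseudocount) / tmh_total)
            / ((non[aa] + pseudocount) / non_total)
        )
        for aa in AMINO_ACIDS
    }


def predict_tmh_positions(sequence, index, threshold=0.0, window=5):
    """Flag residues whose window-averaged index exceeds the threshold."""
    half = window // 2
    flags = []
    for i in range(len(sequence)):
        neighbourhood = sequence[max(0, i - half): i + half + 1]
        score = sum(index[aa] for aa in neighbourhood) / len(neighbourhood)
        flags.append(score > threshold)
    return flags
```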

10.
Oxidative crosslinking of cysteines introduced by site-specific mutagenesis is a powerful tool for structural analysis of proteins, but the approach has been limited to studies in vitro. We recently reported that intact cells of Escherichia coli could be treated with Cu(II)-(o-phenanthroline)3 or molecular iodine in a way that left unperturbed flagellar function or general chemotactic response, yet crosslinks were quantitatively formed between select cysteines in adjoining transmembrane helices of chemoreceptor Trg. This suggested that oxidative crosslinking might be utilized for structural analysis in vivo. Thus, we used our comprehensive collection of Trg derivatives, each containing a single cysteine at one of the 54 positions in the two transmembrane segments of the receptor monomer to characterize patterns of crosslinking in vivo and in vitro for this homodimeric protein. We found that in vivo crosslinking compared favorably as a technique for structural analysis with the more conventional in vitro approach. Patterns of crosslinking generated by oxidation treatments of intact cells indicated extensive interaction of transmembrane segment 1 (TM1) with its homologous partner (TM1') in the other subunit and a more distant placement of TM2 and TM2', the same relationships identified by crosslinking in isolated membranes. In addition, the same helical faces for TM1-TM1' interaction and TM2-TM2' orientation were identified in vivo and in vitro. The correspondence of the patterns also indicates that structural features identified by analysis of in vitro crosslinking are relevant to the organization of the chemoreceptor in its native environment, the intact, functional cell. It appears that the different features of the two functionally benign treatments used for in vivo oxidations can provide insights into protein dynamics.

11.
The reported performance of existing transmembrane (TM) topology prediction methods has often been based on evaluations that neglected the risk of signal peptides (SP) being predicted as putative TM segments. Here, we evaluated 12 selected TM topology prediction methods (TMpred, TopPred II, DAS, TMAP, MEMSAT 2, SOSUI, PRED-TMR2, TMHMM 2.0, HMMTOP 2.0, SPLIT 3.5, TM Finder, and MPEx) for the effect of SP on prediction performance, considering three SP treatments: "remain" (untreated), "removed first", and "removed later". The results showed that the presence of SP significantly reduced the prediction accuracy of the 12 selected methods for all three predicted attributes (the number of transmembrane segments (TMSs), the number of TMSs plus position, and the N-tail location) and for the predicted topology (combined predictions of the three attributes). In particular, lower prediction accuracies were obtained if SP was left untreated ("remain"), while significant increases were observed if SP was removed either first or later. However, the difference between the "removed first" and "removed later" SP treatments was statistically insignificant. In addition, we found that machine learning-based prediction methods were less affected by the presence of SP than hydropathy-based methods, although the risk of degraded prediction performance remains, to a lesser degree. Thus, when performing genome-wide analysis, the SP issue should be addressed during TM topology prediction.

12.
Polytopic protein topology is established in the endoplasmic reticulum (ER) by sequence determinants encoded throughout the nascent polypeptide. Here we characterize 12 topogenic determinants in the cystic fibrosis transmembrane conductance regulator, and identify a novel mechanism by which a charged residue is positioned within the plane of the lipid bilayer. During cystic fibrosis transmembrane conductance regulator biogenesis, topology of the C-terminal transmembrane domain (TMs 7-12) is directed by alternating signal (TMs 7, 9, and 11) and stop transfer (TMs 8, 10, and 12) sequences. Unlike conventional stop transfer sequences, however, TM8 is unable to independently terminate translocation due to the presence of a single charged residue, Asp(924), within the TM segment. Instead, TM8 stop transfer activity is specifically dependent on TM7, which functions both to initiate translocation and to compensate for the charged residue within TM8. Moreover, even in the presence of TM7, the N terminus of TM8 extends significantly into the ER lumen, suggesting a high degree of flexibility in establishing TM8 transmembrane boundaries. These studies demonstrate that signal sequences can markedly influence stop transfer behavior and indicate that ER translocation machinery simultaneously integrates information from multiple topogenic determinants as they are presented in rapid succession during polytopic protein biogenesis.

13.
Membrane topology refers to the two-dimensional structural information of a membrane protein that indicates the number of transmembrane (TM) segments and the orientation of soluble domains relative to the plane of the membrane. Since membrane proteins are co-translationally translocated across and inserted into the membrane, the TM segments orient themselves properly in an early stage of membrane protein biogenesis. Each membrane protein must contain some topogenic signals, but the translocation components and the membrane environment also influence the membrane topology of proteins. We discuss the factors that affect membrane protein orientation and have listed available experimental tools that can be used in determining membrane protein topology.

14.
Park Y, Helms V. Proteins, 2006, 64(4): 895-905
The transmembrane (TM) domains of most membrane proteins consist of helix bundles. The seemingly simple task of TM helix bundle assembly has turned out to be extremely difficult. This is true even for simple TM helix bundle proteins, i.e., those that have the simple form of compact TM helix bundles. Herein, we present a computational method that is capable of generating native-like structural models for simple TM helix bundle proteins having modest numbers of TM helices based on sequence conservation patterns. Thus, the only requirement for our method is the presence of more than 30 homologous sequences for an accurate extraction of sequence conservation patterns. The prediction method first computes a number of representative well-packed conformations for each pair of contacting TM helices, and then a library of tertiary folds is generated by overlaying overlapping TM helices of the representative conformations. This library is scored using sequence conservation patterns, and a subsequent clustering analysis yields five final models. Assuming that neighboring TM helices in the sequence contact each other (but not that TM helices A and G contact each other), the method produced structural models with a Cα atom root-mean-square deviation (CA RMSD) of 3-5 Å from the corresponding crystal structures for bacteriorhodopsin, halorhodopsin, sensory rhodopsin II, and rhodopsin. In blind predictions, this type of contact knowledge is not available. Mimicking this, predictions were made for the rotor of the V-type Na(+)-adenosine triphosphatase without such knowledge. The CA RMSD between the best model and its crystal structure is only 3.4 Å, and its contact accuracy reaches 55%. Furthermore, the model correctly identifies the binding pocket for sodium ion. These results demonstrate that the method can be readily applied to ab initio structure prediction of simple TM helix bundle proteins having modest numbers of TM helices.
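For readers unfamiliar with the CA RMSD figures quoted above, the following Python sketch computes the root-mean-square deviation between two equally sized lists of Cα coordinates. It assumes the two structures have already been optimally superimposed and that atoms are listed in matching order; the superposition step itself is not shown, and the function name is illustrative.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between matched (x, y, z) coordinates of pre-superimposed structures."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

Coordinates given in Å yield an RMSD in Å, so a value of 3.4 would correspond to the best-model result reported in the abstract.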

15.
Structural genomics projects have been accumulating an increasing number of protein structures, many of which remain functionally uncharacterized. In parallel with experimental methods, computational methods are expected to make a significant contribution to the functional elucidation of such proteins. However, conventional computational methods that transfer functions from homologous proteins do not help much for these uncharacterized protein structures because they do not have apparent structural or sequence similarity with known proteins. Here, we briefly review two avenues of computational function prediction, i.e. structure-based methods and sequence-based methods. The focus is on our recent developments of local structure-based and sequence-based methods, which can effectively extract function information from distantly related proteins. Two structure-based methods, Pocket-Surfer and Patch-Surfer, identify similar known ligand binding sites for pocket regions in a query protein without using global protein fold similarity information. Two sequence-based methods, protein function prediction and extended similarity group, make use of weakly similar sequences that are conventionally discarded in homology-based function annotation. Combined with experimental methods, we hope that computational methods will make a leading contribution to the functional elucidation of these protein structures.

16.
While helical transmembrane (TM) region prediction tools achieve high (>90%) success rates for real integral membrane proteins, they produce a considerable number of false positive hits in sequences of known nontransmembrane queries. We propose a modification of the dense alignment surface (DAS) method that achieves a substantial decrease in the false positive error rate. Essentially, a sequence that includes possible transmembrane regions is compared in a second step with TM segments in a sequence library of documented transmembrane proteins. If the performance of the query sequence against the library of documented TM segment-containing sequences in this test is lower than an empirical threshold, it is classified as a non-transmembrane protein. The probability of false positive prediction for trusted TM region hits is expressed in terms of E-values. The modified DAS method, the DAS-TMfilter algorithm, has an unchanged high sensitivity for TM segments (approximately 95% detected in a learning set of 128 documented transmembrane proteins). At the same time, the selectivity measured over a non-redundant set of 526 soluble proteins with known 3D structure is approximately 99%, mainly because a large number of falsely predicted single membrane-pass proteins are eliminated by the DAS-TMfilter algorithm.

17.
Hötzel I, Cheevers WP. Journal of Virology, 2003, 77(21): 11578-11587
A sequence similarity between surface envelope glycoprotein (SU) gp135 of the lentiviruses maedi-visna virus and caprine arthritis-encephalitis virus (CAEV) and human immunodeficiency virus type 1 (HIV-1) gp120 has been described. The regions of sequence similarity are in the second and fifth conserved regions of gp120, and the similarity is highest in sequences coinciding with beta-strands 4 to 8 and 25, which are located in the most virion-proximal region of the gp120 inner domain. A subset of this structure, formed by gp120 beta-strands 4, 5, and 25, is conserved in most or all lentiviruses. Because of the orientation of gp120 on the virion, this highly conserved virion-proximal region of the gp120 core may interact with the transmembrane glycoprotein (TM) together with the amino and carboxy termini of full-length gp120. Therefore, interactions between SU and TM of lentiviruses may be structurally related. Here we tested whether the amino acid residues in the putative virion-proximal region of CAEV gp135 comprising putative beta-strands 4, 5, and 25, as well as its amino and carboxy termini, are important for stable interactions with TM. An amino acid change at gp135 position 119 or 521, located in the turn between putative beta-strands 4 and 5 and near beta-strand 25, respectively, specifically disrupted the epitope recognized by monoclonal antibody 29A. Thus, similar to the corresponding gp120 regions, these gp135 residues are located in close proximity to each other in the folded protein, supporting the hypothesis of a structural similarity between the gp120 virion-proximal inner domain and gp135. Amino acid changes in the amino- and carboxy-terminal and putative virion-proximal regions of gp135 increased gp135 shedding from the cell surface, indicating that these gp135 regions are involved in interactions with TM. Our results indicate structural and functional parallels between CAEV gp135 and HIV-1 gp120 that may be more broadly applicable to the SU of other lentiviruses.

18.
Pairs of helices in transmembrane (TM) proteins are often tightly packed. We present a scoring function and a computational methodology for predicting the tertiary fold of a pair of alpha-helices such that its chances of being tightly packed are maximized. Since the number of TM protein structures solved to date is small, it seems unlikely that a reliable scoring function derived statistically from the known set of TM protein structures will be available in the near future. We therefore constructed a scoring function based on the qualitative insights gained in the past two decades from the solved structures of TM and soluble proteins. In brief, we reward the formation of contacts between small amino acid residues such as Gly, Cys, and Ser that are known to promote dimerization of helices, and penalize the burial of large amino acid residues such as Arg and Trp. As a case study, we show that our method predicts the native structure of the TM homodimer glycophorin A (GpA) to be, in essence, at the global score optimum. In addition, by correlating our results with empirical point mutations on this homodimer, we demonstrate that our method can be a helpful adjunct to mutation analysis. We present a data set of canonical alpha-helices from the solved structures of TM proteins and provide a set of programs for analyzing it (http://ashtoret.tau.ac.il/~sarel). From this data set we derived 11 helix pairs, and conducted searches around their native states as a further test of our method. Approximately 73% of our predictions showed a reasonable fit (RMS deviation <2 Å) with the native structures, compared to the success rate of 8% expected by chance. The search method we employ is less effective for helix pairs that are connected via short loops (<20 amino acid residues), indicating that short loops may play an important role in determining the conformation of alpha-helices in TM proteins.

19.
MLC1 is a membrane protein mainly expressed in astrocytes, and genetic mutations lead to the development of a leukodystrophy, megalencephalic leukoencephalopathy with subcortical cysts disease. Currently, the biochemical properties of the MLC1 protein are largely unknown. In this study, we aimed to characterize the transmembrane (TM) topology and oligomeric nature of the MLC1 protein. Systematic immunofluorescence staining data revealed that the MLC1 protein has eight TM domains and that both the N- and C-terminus face the cytoplasm. We found that MLC1 can be purified as an oligomer and could form a trimeric complex in both detergent micelles and reconstituted proteoliposomes. Additionally, a single-molecule photobleaching experiment showed that MLC1 protein complexes could consist of three MLC1 monomers in the reconstituted proteoliposomes. These results can provide a basis for both the high-resolution structural determination and functional characterization of the MLC1 protein.

20.

Background

Sequence homology considerations widely used to transfer functional annotation to uncharacterized protein sequences require special precautions in the case of non-globular sequence segments including membrane-spanning stretches composed of non-polar residues. Simple, quantitative criteria are desirable for identifying transmembrane helices (TMs) that must be included into or should be excluded from start sequence segments in similarity searches aimed at finding distant homologues.

Results

We found that there are two types of TMs in membrane-associated proteins. On the one hand, there are so-called simple TMs with elevated hydrophobicity, low sequence complexity and extraordinary enrichment in long aliphatic residues. They merely serve as membrane-anchoring devices. In contrast, so-called complex TMs have lower hydrophobicity, higher sequence complexity and some functional residues. These TMs have additional roles besides membrane anchoring, such as intra-membrane complex formation, ligand binding or a catalytic role. Simple and complex TMs can occur both in single- and multi-membrane-spanning proteins, essentially in any type of topology. Whereas simple TMs have the potential to confuse searches for sequence homologues and to generate unrelated hits with seemingly convincing statistical significance, complex TMs contain essential evolutionary information.

Conclusion

For extending the homology concept onto membrane proteins, we provide a necessary quantitative criterion to distinguish simple TMs (and a sufficient criterion for complex TMs) in query sequences prior to their usage in homology searches based on assessment of hydrophobicity and sequence complexity of the TM sequence segments.

Reviewers

This article was reviewed by Shamil Sunyaev, L. Aravind and Arcady Mushegian.
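The abstract above describes a criterion based on hydrophobicity and sequence complexity but does not give the measures or cutoffs. The Python sketch below illustrates the general shape of such a test, assuming mean Kyte-Doolittle hydropathy as the hydrophobicity measure and Shannon entropy of the residue composition as the complexity measure. Both thresholds are hypothetical placeholders, not values from the paper.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy values, an assumed stand-in hydrophobicity scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


def shannon_entropy(segment):
    """Entropy (bits) of the residue composition; low values mean low complexity."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_like_simple_tm(segment, min_hydropathy=2.5, max_entropy=2.5):
    """Hypothetical test: very hydrophobic and compositionally simple."""
    return (
        mean_hydropathy(segment) >= min_hydropathy
        and shannon_entropy(segment) <= max_entropy
    )
```

In a homology-search workflow, segments flagged by such a test would be candidates for masking in the query, in line with the abstract's point that simple TMs can produce unrelated hits.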
