Similar articles
 20 similar articles found (search time: 234 ms)
1.
Web-based servers implementing the DAS-TMfilter algorithm have been launched at three mirror sites and their usage is described. The underlying computer program is an upgraded and modified version of the DAS-prediction method. The new server has a low rate of false-positive predictions (approximately 1 among 100 unrelated queries) while the high efficiency of the original algorithm in locating TM segments in queries is preserved (sensitivity of approximately 95% among documented proteins with helical TM regions). AVAILABILITY: The server operates at three mirror sites: http://mendel.imp.univie.ac.at/sat/DAS/DAS.html, http://wooster.bip.bham.ac.uk/DAS.html and http://www.enzim.hu/DAS/DAS.html. The program is available on request.

2.
Predicting the transmembrane regions is an important aspect of understanding the structures and architecture of different β-barrel membrane proteins. Despite significant efforts, currently available β-transmembrane region predictors are still limited in terms of prediction accuracy, especially in precision. Here, we describe PredβTM, a transmembrane region prediction algorithm for β-barrel proteins. Using amino acid pair frequency information in known β-transmembrane protein sequences, we have trained a support vector machine classifier to predict β-transmembrane segments. Position-specific amino acid preference data are incorporated in the final prediction. The predictor does not incorporate evolutionary profile information explicitly, but is based on sequence patterns generated implicitly by encoding the protein segments using an amino acid adjacency matrix. With a benchmark set of 35 β-transmembrane proteins, PredβTM shows a sensitivity and precision of 83.71% and 72.98%, respectively. The segment overlap score is 82.19%. In comparison with other state-of-the-art methods, PredβTM provides a higher precision and segment overlap without compromising sensitivity. Further, we applied PredβTM to analyze the β-barrel membrane proteins without defined transmembrane regions and the uncharacterized protein sequences in eight bacterial genomes and predict possible β-transmembrane proteins. PredβTM can be freely accessed on the web at http://transpred.ki.si/.
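The sensitivity and precision figures quoted above can be illustrated with a minimal sketch. It assumes per-residue scoring on 0/1 label strings (1 = transmembrane residue); the abstract does not say whether its figures are per-residue or per-segment, and the function name `sensitivity_precision` is hypothetical.

```python
def sensitivity_precision(observed, predicted):
    """Per-residue sensitivity and precision for equal-length 0/1 label strings.

    Assumption: '1' marks a transmembrane residue, '0' anything else.
    """
    if len(observed) != len(predicted):
        raise ValueError("label strings must have equal length")
    pairs = list(zip(observed, predicted))
    tp = sum(o == "1" and p == "1" for o, p in pairs)  # true positives
    fn = sum(o == "1" and p == "0" for o, p in pairs)  # missed TM residues
    fp = sum(o == "0" and p == "1" for o, p in pairs)  # over-predicted residues
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, precision
```

A predictor that shifts a four-residue segment by one position, for example, scores 0.75 on both measures under this definition.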

3.
Myristoylation by the myristoyl-CoA:protein N-myristoyltransferase (NMT) is an important lipid anchor modification of eukaryotic and viral proteins. Automated prediction of N-terminal N-myristoylation from the substrate protein sequence alone is necessary for large-scale sequence annotation projects but it requires a low rate of false positive hits in addition to a sufficient sensitivity. Our previous analysis of substrate protein sequence variability, NMT sequences and 3D structures has revealed motif properties in addition to the known PROSITE motif that are utilized in a new predictor described here. The composite prediction function (with separate ad hoc parameterization (a) for queries from non-fungal eukaryotes and their viruses and (b) for sequences from fungal species) consists of terms evaluating amino acid type preferences at sequence positions close to the N terminus as well as terms penalizing deviations from the physical property pattern of amino acid side-chains encoded in multi-residue correlation within the motif sequence. The algorithm has been validated with a self-consistency and two jack-knife tests for the learning set as well as with kinetic data for model substrates. The sensitivity in recognizing documented NMT substrates is above 95% for both taxon-specific versions. The corresponding rate of false positive prediction (for sequences with an N-terminal glycine residue) is close to 0.5%; thus, the technique is applicable for large-scale automated sequence database annotation. The predictor is available as a public WWW server with the URL http://mendel.imp.univie.ac.at/myristate/.
Additionally, we propose a version of the predictor that identifies a number of proteolytic protein processing sites at internal glycine residues and that evaluates possible N-terminal myristoylation of the protein fragments. A scan of public protein databases revealed new potential NMT targets for which the myristoyl modification may be of critical importance for biological function. Among others, the list includes kinases, phosphatases, proteasomal regulatory subunit 4, kinase interacting proteins KIP1/KIP2, protozoan flagellar proteins, homologues of mitochondrial translocase TOM40, of the neuronal calcium sensor NCS-1 and of the cytochrome c-type heme lyase CCHL. Analyses of complete eukaryote genomes indicate that about 0.5% of all encoded proteins are apparent NMT substrates except for a higher fraction in Arabidopsis thaliana (approximately 0.8%).

4.
Large-scale genome sequencing gained general importance for life science because functional annotation of otherwise experimentally uncharacterized sequences is made possible by the theory of biomolecular sequence homology. Historically, the paradigm of similarity of protein sequences implying common structure, function and ancestry was generalized based on studies of globular domains. Having the same fold imposes strict conditions over the packing in the hydrophobic core requiring similarity of hydrophobic patterns. The implications of sequence similarity among non-globular protein segments have not been studied to the same extent; nevertheless, homology considerations are silently extended for them. This appears especially detrimental in the case of transmembrane helices (TMs) and signal peptides (SPs) where sequence similarity is necessarily a consequence of physical requirements rather than common ancestry. Thus, matching of SPs/TMs creates the illusion of matching hydrophobic cores. Therefore, inclusion of SPs/TMs into domain models can give rise to wrong annotations. More than 1001 domains among the 10,340 models of Pfam release 23 and 18 domains of SMART version 6 (out of 809) contain SP/TM regions. As expected, fragment-mode HMM searches generate promiscuous hits limited to solely the SP/TM part among clearly unrelated proteins. More worryingly, we show explicit examples that the scores of clearly false-positive hits, even in global-mode searches, can be elevated into the significance range just by matching the hydrophobic runs. In the PIR iProClass database v3.74 using conservative criteria, we find that at least between 2.1% and 13.6% of its annotated Pfam hits appear unjustified for a set of validated domain models. Thus, false-positive domain hits enforced by SP/TM regions can lead to dramatic annotation errors where the hit has nothing in common with the problematic domain model except the SP/TM region itself. 
We suggest a workflow of flagging problematic hits arising from SP/TM-containing models for critical reconsideration by annotation users.

5.
MOTIVATION: Many important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion are mediated by membrane proteins. Unfortunately, as these proteins are not water soluble, it is extremely hard to experimentally determine their structure. Therefore, improved methods for predicting the structure of these proteins are vital in biological research. In order to improve transmembrane topology prediction, we evaluate the combined use of both integrated signal peptide prediction and evolutionary information in a single algorithm. RESULTS: A new method (MEMSAT3) for predicting transmembrane protein topology from sequence profiles is described and benchmarked with full cross-validation on a standard data set of 184 transmembrane proteins. The method is found to predict both the correct topology and the locations of transmembrane segments for 80% of the test set. This compares with accuracies of 62-72% for other popular methods on the same benchmark. By using a second neural network specifically to discriminate transmembrane from globular proteins, a very low overall false positive rate (0.5%) can also be achieved in detecting transmembrane proteins. AVAILABILITY: An implementation of the described method is available both as a web server (http://www.psipred.net) and as downloadable source code from http://bioinf.cs.ucl.ac.uk/memsat. Both the server and source code files are free to non-commercial users. Benchmark and training data are also available from http://bioinf.cs.ucl.ac.uk/memsat.

6.
Based on the principle of dual prediction by segment hydrophobicity and nonpolar phase helicity, in concert with imposed threshold values of these two parameters, we developed the automated prediction program TM Finder that can successfully locate most transmembrane (TM) segments in proteins. The program uses the results of experiments on a series of host-guest TM segment mimic peptides of prototypic sequence KKAAAXAAAAAXAAWAAXAAAKKKK-amide (where X = each of the 20 commonly occurring amino acids) through which an HPLC-derived hydropathy scale, a hydrophobicity threshold for spontaneous membrane insertion, and a nonpolar phase helical propensity scale were determined. Using these scales, the optimized prediction algorithm of TM Finder defines TM segments by first searching for competent core segments using the combination of hydrophobicity and helicity scales, and then performing a gap-joining operation, which minimizes prediction bias caused by local hydrophilic residues and/or the choice of window size. In addition, the hydrophobicity threshold requirement enables TM Finder to distinguish reliably between membrane proteins and globular proteins, thereby adding an important dimension to the program. A full web version of the TM Finder program can be accessed at http://www.bioinformatics-canada.org/TM/.
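The two-stage idea described above (threshold-based core segments, then gap joining) can be sketched as follows. This is not TM Finder itself: the sketch substitutes the Kyte-Doolittle hydropathy scale for the program's HPLC-derived scale, omits the helicity scale entirely, and uses hypothetical window, threshold, and gap parameters chosen only for illustration.

```python
# Kyte-Doolittle hydropathy values (a stand-in for TM Finder's own scale).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

def find_core_segments(seq, window=9, threshold=1.6):
    """Return half-open (start, end) runs of window centres whose mean
    hydropathy is at least `threshold` (both parameters are assumptions)."""
    half = window // 2
    hot = [False] * len(seq)
    for start in range(len(seq) - window + 1):
        mean = sum(KD[r] for r in seq[start:start + window]) / window
        if mean >= threshold:
            hot[start + half] = True  # mark only the central residue
    segments, run_start = [], None
    for i, flag in enumerate(hot + [False]):  # sentinel closes a trailing run
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            segments.append((run_start, i))
            run_start = None
    return segments

def join_gaps(segments, max_gap=3):
    """Merge consecutive segments separated by at most `max_gap` residues."""
    merged = []
    for start, end in segments:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged
```

Marking only the window centre keeps the core segments from absorbing flanking hydrophilic residues; the gap-joining step then decides separately whether two nearby cores belong to one segment, which is where the choice of `max_gap` matters.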

7.
8.
The fungal transamidase complex that executes glycosylphosphatidylinositol (GPI) lipid anchoring of precursor proteins has overlapping but distinct sequence specificity compared with the animal system. Therefore, a taxon-specific prediction tool for the recognition of the C-terminal signal in fungal sequences is necessary. We have collected a learning set of fungal precursor protein sequences from the literature and fungal proteomes. Although the general four segment scheme of the recognition signal is maintained also in fungal precursors, there are taxon specificities in details. A fungal big-Pi predictor has been developed for the assessment of query sequence concordance with fungi-specific recognition signal requirements. The sensitivity of this predictor is close to 90%. The rate of false positive prediction is in the range of 0.1%. The fungal big-Pi tool successfully predicts the Gas1 mutation series described by C. Nuoffer and co-workers, and recognizes that the human PLAP C terminus is not a target for the fungal transamidase complex. Lists of potentially GPI lipid anchored proteins for five fungal proteomes have been generated and the hits have been functionally classified. The fungal big-Pi prediction WWW server as well as precursor lists are available at

9.
10.
A total of 20%-25% of the proteins in a typical genome are helical membrane proteins. The transmembrane regions of these proteins have markedly different properties when compared with globular proteins. This presents a problem when homology search algorithms optimized for globular proteins are applied to membrane proteins. Here we present modifications of the standard Smith-Waterman and profile search algorithms that significantly improve the detection of related membrane proteins. The improvement is based on the inclusion of information about predicted transmembrane segments in the alignment algorithm. This is done by simply increasing the alignment score if two residues predicted to belong to transmembrane segments are aligned with each other. Benchmarking over a test set of G-protein-coupled receptor sequences shows that the number of false positives is significantly reduced in this way, both when closely related and distantly related proteins are searched for.
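The scoring modification described above can be sketched as a small Smith-Waterman variant. This is an illustration under stated assumptions, not the authors' implementation: it uses a flat match/mismatch score instead of a substitution matrix, a linear gap penalty, and a hypothetical `tm_bonus` value; the TM predictions are passed in as per-residue booleans.

```python
def sw_score_with_tm_bonus(seq_a, seq_b, tm_a, tm_b,
                           match=2, mismatch=-1, gap=-2, tm_bonus=1):
    """Best local alignment score, with `tm_bonus` added whenever both
    aligned residues are predicted to be transmembrane.

    tm_a / tm_b: lists of booleans, one per residue (assumed input format).
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    H = [[0] * cols for _ in range(rows)]  # dynamic-programming matrix
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if tm_a[i - 1] and tm_b[j - 1]:
                s += tm_bonus  # reward TM-to-TM pairings
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align the two residues
                          H[i - 1][j] + gap,     # gap in seq_b
                          H[i][j - 1] + gap)     # gap in seq_a
            best = max(best, H[i][j])
    return best
```

With all TM flags set to false the function reduces to an ordinary Smith-Waterman score, so the bonus can be switched on and off to compare rankings of the same database hits.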

11.
Park Y, Helms V. Proteins 2006, 64(4): 895-905
The transmembrane (TM) domains of most membrane proteins consist of helix bundles. The seemingly simple task of TM helix bundle assembly has turned out to be extremely difficult. This is true even for simple TM helix bundle proteins, i.e., those that have the simple form of compact TM helix bundles. Herein, we present a computational method that is capable of generating native-like structural models for simple TM helix bundle proteins having modest numbers of TM helices based on sequence conservation patterns. Thus, the only requirement for our method is the presence of more than 30 homologous sequences for an accurate extraction of sequence conservation patterns. The prediction method first computes a number of representative well-packed conformations for each pair of contacting TM helices, and then a library of tertiary folds is generated by overlaying overlapping TM helices of the representative conformations. This library is scored using sequence conservation patterns, and a subsequent clustering analysis yields five final models. Assuming that neighboring TM helices in the sequence contact each other (but not that TM helices A and G contact each other), the method produced structural models of Cα atom root-mean-square deviation (CA RMSD) of 3-5 Å from corresponding crystal structures for bacteriorhodopsin, halorhodopsin, sensory rhodopsin II, and rhodopsin. In blind predictions, this type of contact knowledge is not available. Mimicking this, predictions were made for the rotor of the V-type Na(+)-adenosine triphosphatase without such knowledge. The CA RMSD between the best model and its crystal structure is only 3.4 Å, and its contact accuracy reaches 55%. Furthermore, the model correctly identifies the binding pocket for sodium ion. These results demonstrate that the method can be readily applied to ab initio structure prediction of simple TM helix bundle proteins having modest numbers of TM helices.

12.
The prediction of transmembrane (TM) helix and topology provides important information about the structure and function of a membrane protein. Due to the experimental difficulties in obtaining a high-resolution model, computational methods are highly desirable. In this paper, we present a hierarchical classification method using support vector machines (SVMs) that integrates selected features by capturing the sequence-to-structure relationship and developing a new scoring function based on membrane protein folding. The proposed approach is evaluated on low- and high-resolution data sets with cross-validation, and the topology (sidedness) prediction accuracy reaches as high as 90%. Our method is also found to correctly predict both the location of TM helices and the topology for 69% of the low-resolution benchmark set. We also test our method for discrimination between soluble and membrane proteins and achieve very low overall false positive (0.5%) and false negative rates (0 to approximately 1.2%). Lastly, the analysis of the scoring function suggests that the topogeneses of single-spanning and multispanning TM proteins have different levels of complexity, and the consideration of interloop topogenic interactions for the latter is the key to achieving better predictions. This method can facilitate the annotation of membrane proteomes to extract useful structural and functional information. It is publicly available at http://bio-cluster.iis.sinica.edu.tw/~bioapp/SVMtop.

13.
Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset was 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of the accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. Availability: The datasets, software together with supplementary materials are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm.

14.
Modeling of integral membrane proteins and the prediction of their functional sites requires the identification of transmembrane (TM) segments and the determination of their angular orientations. Hydrophobicity scales predict accurately the location of TM helices, but are less accurate in computing angular disposition. Estimating lipid-exposure propensities of the residues from statistics of solved membrane protein structures has the disadvantage of relying on relatively few proteins. As an alternative, we propose here a scale of knowledge-based Propensities for Residue Orientation in Transmembrane segments (kPROT), derived from the analysis of more than 5000 non-redundant protein sequences. We assume that residues that tend to be exposed to the membrane are more frequent in TM segments of single-span proteins, while residues that prefer to be buried in the transmembrane bundle interior are present mainly in multi-span TMs. The kPROT value for each residue is thus defined as the logarithm of the ratio of its proportions in single and multiple TM spans. The scale is refined further by defining it for three discrete sections of the TM segment; namely, extracellular, central, and intracellular. The capacity of the kPROT scale to predict angular helical orientation was compared to that of alternative methods in a benchmark test, using a diversity of multi-span alpha-helical transmembrane proteins with a solved 3D structure. kPROT yielded an average angular error of 41°, significantly lower than that of alternative scales (62°-68°). The new scale thus provides a useful general tool for modeling and prediction of functional residues in membrane proteins. A WWW server (http://bioinfo.weizmann.ac.il/kPROT) is available for automatic helix orientation prediction with kPROT.
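The log-ratio definition above can be made concrete with a short sketch. It assumes natural logarithms and adds a pseudocount so that residues absent from one set do not produce a division by zero; neither choice is stated in the abstract, and the function name `kprot_like_scale` is hypothetical. The three-section refinement (extracellular, central, intracellular) is not modeled.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def kprot_like_scale(single_span_segments, multi_span_segments, pseudocount=1.0):
    """Log ratio of each residue's proportion in single-span versus
    multi-span TM segments. Positive values suggest a lipid-exposed
    preference under the abstract's assumption; negative values, buried."""
    single = Counter("".join(single_span_segments))
    multi = Counter("".join(multi_span_segments))
    n_single = sum(single[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    n_multi = sum(multi[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    scale = {}
    for a in AMINO_ACIDS:
        p_single = (single[a] + pseudocount) / n_single
        p_multi = (multi[a] + pseudocount) / n_multi
        scale[a] = math.log(p_single / p_multi)
    return scale
```

On real data the two inputs would be lists of TM segment sequences taken from annotated single-span and multi-span proteins; the toy inputs used for checking the sketch only confirm the sign convention.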

15.
MOTIVATION: The dearth of structural data on alpha-helical membrane proteins (MPs) has hampered thus far the development of reliable knowledge-based potentials that can be used for automatic prediction of transmembrane (TM) protein structure. While algorithms for identifying TM segments are available, modeling of the TM domains of alpha-helical MPs involves assembling the segments into a bundle. This requires the correct assignment of the buried and lipid-exposed faces of the TM domains. RESULTS: A recent increase in the number of crystal structures of alpha-helical MPs has enabled an analysis of the lipid-exposed surfaces and the interiors of such molecules on the basis of structure, rather than sequence alone. Together with a conservation criterion that is based on previous observations that conserved residues are mostly found in the interior of MPs, the bias of certain residue types to be preferably buried or exposed is proposed as a criterion for predicting the lipid-exposed and interior faces of TMs. Application to known structures demonstrates 80% accuracy of this prediction algorithm. AVAILABILITY: The algorithm used for the predictions is implemented in the ProperTM Web server (http://icb.med.cornell.edu/services/propertm/start).

16.
The prediction of a protein's structure from its amino acid sequence has been a long-standing goal of molecular biology. In this work, a new set of conformational parameters for membrane spanning alpha helices was developed using the information from the topology of 70 membrane proteins. Based on these conformational parameters, a simple algorithm has been formulated to predict the transmembrane alpha helices in membrane proteins. A FORTRAN program has been developed which takes the amino acid sequence as input and gives the predicted transmembrane alpha-helices as output. The present method correctly identifies 295 transmembrane helical segments in 70 membrane proteins with only two overpredictions. Furthermore, this method predicts all 45 transmembrane helices in the photosynthetic reaction center, bacteriorhodopsin and cytochrome c oxidase to an 86% level of accuracy and so is better than all other methods published to date.

17.
The three Tritryps, the pathogenic protozoa Leishmania major, Trypanosoma brucei and Trypanosoma cruzi, use surface molecules, among others, to evolve strategies for evading the immune system and for their survival in the host systems. Since only 36% of the protein coding genes in the L. major genome have a putative function ascribed to them, we undertook a genome analysis of the L. major genome for identification of adhesin-like and other surface proteins from amongst these hypothetical sequences. Our analysis resulted in the identification of a total of 194 hits, 120 of which had a predicted transmembrane region, 56 had both a transmembrane and signal peptide region, 1 sequence had only a predicted signal peptide region whereas 17 sequences had neither of the two. Six protein sequences could be assigned a putative adhesin-like domain region based on the analysis. Hopefully, future detailed experimental studies will elucidate more vividly the role of these hits in Leishmania pathogenesis.

18.
The YidC/Oxa1/Alb3 family of proteins catalyzes membrane protein insertion in bacteria, mitochondria, and chloroplasts. In this study, we investigated which regions of the bacterial YidC protein are important for its function in membrane protein biogenesis. In Escherichia coli, YidC spans the membrane six times, with a large 319-residue periplasmic domain following the first transmembrane domain. We found that this large periplasmic domain is not required for YidC function and that the residues in the exposed hydrophilic loops or C-terminal tail are not critical for YidC activity. Rather, the five C-terminal transmembrane segments that contain the three consensus sequences in the YidC/Oxa1/Alb3 family are important for its function. However, by systematically replacing all the residues in transmembrane segment (TM) 2, TM3, and TM6 with serine and by swapping TM4 and TM5 with unrelated transmembrane segments, we show that the precise sequence of these transmembrane regions is not essential for in vivo YidC activity. Single serine mutations in TM2, TM3, and TM6 impaired the membrane insertion of the Sec-independent procoat-leader peptidase protein. We propose that the five C-terminal transmembrane segments of YidC function as a platform for the translocating substrate protein to support its insertion into the membrane.

19.
Growth factor receptors are typically activated by the binding of soluble ligands to the extracellular domain of the receptor, but certain viral transmembrane proteins can induce growth factor receptor activation by binding to the receptor transmembrane domain. For example, homodimers of the transmembrane 44-amino acid bovine papillomavirus E5 protein bind the transmembrane region of the PDGF beta receptor tyrosine kinase, causing receptor dimerization, phosphorylation, and cell transformation. To determine whether it is possible to select novel biologically active transmembrane proteins that can activate growth factor receptors, we constructed and identified small proteins with random hydrophobic transmembrane domains that can bind and activate the PDGF beta receptor. Remarkably, cell transformation was induced by approximately 10% of the clones in a library in which 15 transmembrane amino acid residues of the E5 protein were replaced with random hydrophobic sequences. The transformation-competent transmembrane proteins formed dimers and stably bound and activated the PDGF beta receptor. Genetic studies demonstrated that the biological activity of the transformation-competent proteins depended on specific interactions with the transmembrane domain of the PDGF beta receptor. A consensus sequence distinct from the wild-type E5 sequence was identified that restored transforming activity to a non-transforming poly-leucine transmembrane sequence, indicating that divergent transmembrane sequence motifs can activate the PDGF beta receptor. Molecular modeling suggested that diverse transforming sequences shared similar protein structure, including the same homodimer interface as the wild-type E5 protein. These experiments have identified novel proteins with transmembrane sequences distinct from the E5 protein that can activate the PDGF beta receptor and transform cells. 
More generally, this approach may allow the creation and identification of small proteins that modulate the activity of a variety of cellular transmembrane proteins.

20.
In order to identify new transmembrane helix packing motifs in naturally occurring proteins, we have selected transmembrane domains from a library of random Escherichia coli genomic DNA fragments and screened them for homomultimerization via their abilities to dimerize the bacteriophage lambda cI repressor DNA-binding domain. Sequences were isolated using a modified lambda cI headpiece dimerization assay system, which was shown previously to measure transmembrane helix-helix association in the E. coli inner membrane. Screening resulted in the identification of several novel sequences that appear to mediate helix-helix interactions. One sequence, representing the predicted sixth transmembrane domain (TM6) of the E. coli protein YjiO, was chosen for further analysis. Using site-directed mutagenesis and molecular dynamics, a small set of models for YjiO TM6 multimerization interface interactions was generated. This work demonstrates the utility of combining in vivo genetic tools with computational systems for understanding membrane protein structure and assembly.
