Similar articles
1.
We present here a neural network-based method for detection of signal peptides (abbreviation used: SP) in proteins. The method is trained on sequences of known signal peptides extracted from the Swiss-Prot protein database and is able to work separately on prokaryotic and eukaryotic proteins. A query protein is dissected into overlapping short sequence fragments, and then each fragment is analyzed with respect to the probability of it being a signal peptide and containing a cleavage site. While the accuracy of the method is comparable to that of other existing prediction tools, it offers significantly higher speed and portability. The accuracy of cleavage site prediction reaches 73% on heterogeneous source data that contains both prokaryotic and eukaryotic sequences, while the accuracy of discrimination between signal peptides and non-signal peptides is above 93% for any source dataset. As a consequence, the method can be easily applied to genome-wide datasets. The software can be downloaded freely from http://rpsp.bioinfo.pl/RPSP.tar.gz.
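The fragment-dissection step described in this abstract can be sketched as a sliding window. This is a minimal illustration only: the window length, the step, and the function names are assumptions, and `hydrophobic_fraction` is a toy stand-in for the trained neural network, which the abstract does not specify.

```python
def overlapping_fragments(sequence, window=30, step=1):
    """Yield (start, fragment) pairs for a window sliding along the sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # A sequence shorter than the window yields itself as the only fragment.
    last_start = max(len(sequence) - window, 0)
    for start in range(0, last_start + 1, step):
        yield start, sequence[start:start + window]


def hydrophobic_fraction(fragment):
    """Toy scorer (NOT the published network): share of hydrophobic residues."""
    if not fragment:
        return 0.0
    return sum(residue in "AVLIMFWC" for residue in fragment) / len(fragment)


def best_fragment(sequence, scorer, window=30, step=1):
    """Return (score, start, fragment) for the highest-scoring window."""
    return max(
        (scorer(fragment), start, fragment)
        for start, fragment in overlapping_fragments(sequence, window, step)
    )
```

Any callable that maps a fragment to a score can be passed as `scorer`, so a trained classifier could replace the toy function without changing the windowing code.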

2.
Membrane-associated and secreted proteins are translated as precursors containing a signal peptide that allows protein insertion into the membrane of the endoplasmic reticulum and is co-translationally removed in the lumen. The ability of the signal peptide to direct a polypeptide into the secretory pathway is exploited in methods developed to select cDNAs encoding such proteins. Different strategies are known in which cDNA libraries can be screened for signal peptides by the ability of the latter to rescue the translocation of signal sequence-less proteins. In one method, a cDNA library is tested for interleukin 2 receptor α chain translocation to the membrane in COS cells, in another for invertase secretion from yeast. In this work, we compared the two systems by testing six mouse signal peptides in COS and yeast cells. All of them were functional in the mammalian system, whereas only three were functional in yeast. Two other sequences needed the 5′ cDNA sequence flanking the ATG codon to be removed in order to enable protein translocation. Although the structure of signal sequences and the functioning of the secretory machinery are well conserved from prokaryotes to eukaryotes, it seems evident that not all signal peptides can be interchanged between different proteins and organisms. In particular, signal peptides that are functional in the mammalian system do not necessarily lead to protein translocation in yeast. Received: 9 March 2001

3.
We have designed two original sets of oligonucleotide primers that hybridize to the relatively conserved motifs within the immunoglobulin signal sequences of each of the 15 heavy chain and 18 kappa light chain gene families. Comparison of these 5' primers with the immunoglobulin signal sequences referenced in the Kabat database suggests that these oligonucleotide primers should hybridize with 89.4% of the 428 mouse heavy chain signal sequences and with 91.8% of the 320 kappa light chain signal sequences with no mismatch. Following PCR amplification using the designed primers and direct sequencing of the amplified products, we obtained full-length variable sequences belonging to major (V(H)1, V(H)2, V(H)3, Vkappa1 and Vkappa21) but also small-sized (V(H)9, V(H)14, Vkappa2, Vkappa9A/9B, Vkappa12/13, Vkappa23 and Vkappa33/34) gene families, from nine murine monoclonal antibodies. This strategy could be a powerful tool for antibody sequence assessment, whatever the V gene family, before humanization of a mouse monoclonal antibody or identification of paratope-derived peptides.

4.
This review discusses efforts to understand the mode of action of signal sequences by biophysical study of synthetic peptides corresponding to these protein localization signals. On the basis of reports from several laboratories, it is now clear that signal peptides may adopt a variety of conformations, depending on their local environment. In membrane-mimetic systems like detergent micelles or lipid vesicles, they have a high tendency to form helices. Ability to take up a helical conformation appears to be required at some point in the function of a signal sequence, since some peptides corresponding to export-defective signal sequences display reduced helical potential. By contrast, functional signal sequences share a high capacity to adopt helices. High affinity for organized lipid assemblies, like monolayers or vesicles, is also a property of functional signal sequences. This correlation suggests a role for direct interaction of signal sequences with the lipids of the cytoplasmic membrane in vivo. Supporting this role are studies of the influence of signal peptides on lipid structure, which reveal an ability of these peptides to perturb lipid packing and to alter the phase state of the lipids. Insertion of the signal sequence in vivo could substantially reduce the barrier for translocation of the mature chain. Lastly, synthetic signal peptides have been added to native membranes and found to inhibit translocation of precursor proteins. This approach bridges the biophysical and the biochemical aspects of protein export and promises to shed light on the functional correlates of the properties and interactions observed in model systems.

5.
SPEPlip: the detection of signal peptide and lipoprotein cleavage sites
SUMMARY: SPEPlip is a neural network-based method, trained and tested on a set of experimentally derived signal peptides from eukaryotes and prokaryotes. SPEPlip identifies the presence of sorting signals and predicts their cleavage sites. The accuracy in cross-validation is similar to that of other available programs: the rate of false positives is 4 and 6% for prokaryotes and eukaryotes, respectively, and that of false negatives is 3% in both cases. When a set of 409 prokaryotic lipoproteins is predicted, SPEPlip predicts 97% of the chains in the signal peptide class. However, by integrating SPEPlip with a regular expression search utility based on the PROSITE pattern, we can successfully discriminate signal peptide-containing chains from lipoproteins. We propose the method for detecting and discriminating signal peptide-containing chains and lipoproteins. AVAILABILITY: It can be accessed through the web page at http://gpcr.biocomp.unibo.it/predictors/
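A regular expression search based on a PROSITE pattern, as mentioned in this abstract, can be sketched by translating PROSITE pattern syntax into a Python regex. The abstract does not say which pattern SPEPlip uses; the string below is an illustrative assumption modeled on the prokaryotic lipoprotein lipid-attachment site and should be verified against PROSITE before use. The function names and the `max_n_region` cutoff are hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Translate PROSITE pattern syntax into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {ABC} means "any residue except A, B, C".
        element = element.replace("{", "[^").replace("}", "]")
        # (n) or (n,m) are repetition counts.
        element = element.replace("(", "{").replace(")", "}")
        # x is the wildcard; < and > anchor the N- and C-terminus.
        element = element.replace("x", ".")
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


# Assumed, illustrative pattern (not taken from the abstract).
LIPOBOX_PATTERN = "{DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C"
LIPOBOX = re.compile(prosite_to_regex(LIPOBOX_PATTERN))


def has_lipobox(sequence, max_n_region=40):
    """Report whether the N-terminal region contains a match to the pattern."""
    return LIPOBOX.search(sequence[:max_n_region]) is not None
```

In a combined workflow, a chain predicted to carry a signal peptide would be reassigned to the lipoprotein class when `has_lipobox` returns `True`; that decision rule is an assumption about how the two tools might be integrated.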

6.
As knowledge of protein signal peptides can be used to reprogram cells in a desired way for gene therapy, signal peptides have become a crucial tool for researchers designing new drugs that target a particular organelle to correct a specific defect. To use such a technique effectively, however, we have to develop an automated method for fast and accurate prediction of signal peptides and their cleavage sites, particularly in the post-genomic era, when the number of protein sequences is increasing explosively. To realize this, the first important step is to discriminate secretory proteins from non-secretory proteins. On the basis of the Needleman-Wunsch algorithm, we propose a new alignment kernel function. The novel approach can be effectively used to extract the statistical properties of protein sequences for machine learning, leading to a higher prediction success rate.
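The Needleman-Wunsch algorithm that this abstract builds on computes a global alignment score by dynamic programming. The sketch below uses a simple match/mismatch/linear-gap scheme; those scores are assumptions, and the abstract does not describe how the score is turned into a kernel. The normalization in `normalized_similarity` is one hypothetical choice and is not guaranteed to yield a positive semidefinite kernel.

```python
import math


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of a and b (Needleman-Wunsch, linear gap cost)."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def normalized_similarity(a, b, **scores):
    """Hypothetical normalization: s(a, b) / sqrt(s(a, a) * s(b, b))."""
    self_product = nw_score(a, a, **scores) * nw_score(b, b, **scores)
    if self_product <= 0:
        raise ValueError("self-alignment scores must be positive")
    return nw_score(a, b, **scores) / math.sqrt(self_product)
```

A matrix of such pairwise values over a training set could be passed to a kernel-based learner that accepts precomputed similarities, subject to the caveat above.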

7.
Recent advances in high-throughput technologies have made it possible to generate both gene and protein sequence data at an unprecedented rate and scale thereby enabling entirely new "omics"-based approaches towards the analysis of complex biological processes. However, the amount and complexity of data that even a single experiment can produce seriously challenges researchers with limited bioinformatics expertise, who need to handle, analyze and interpret the data before it can be understood in a biological context. Thus, there is an unmet need for tools allowing non-bioinformatics users to interpret large data sets. We have recently developed a method, NNAlign, which is generally applicable to any biological problem where quantitative peptide data is available. This method efficiently identifies underlying sequence patterns by simultaneously aligning peptide sequences and identifying motifs associated with quantitative readouts. Here, we provide a web-based implementation of NNAlign allowing non-expert end-users to submit their data (optionally adjusting method parameters), and in return receive a trained method (including a visual representation of the identified motif) that subsequently can be used as prediction method and applied to unknown proteins/peptides. We have successfully applied this method to several different data sets including peptide microarray-derived sets containing more than 100,000 data points. NNAlign is available online at http://www.cbs.dtu.dk/services/NNAlign.

8.
The molecular evolution of signal peptides
Williams EJ  Pal C  Hurst LD 《Gene》2000,253(2):313-322
Signal peptides direct mature peptides to their appropriate cellular location, after which they are cleaved off. Very many random alternatives can serve the same function. Of all coding sequences, therefore, signal peptides might come closest to being neutrally evolving. Here we consider this issue by examining the molecular evolution of 76 mouse-rat orthologues, each with defined signal peptides. Although they do evolve rapidly, they evolve about half as fast as neutral sequences. This indicates that a substantial proportion of mutations must be under stabilizing selection. A few putative signal sequences lack a hydrophobic core and these tend to be more slowly evolving than others, indicating even stronger stabilizing selection. However, closer scrutiny suggests that some of these represent mis-annotations in GenBank. It is also likely that some of the substitutions are not neutral. We find, for example, that the rate of protein evolution correlates with that of the mature peptide. This may be a result of compensatory evolution. We also find that signal peptides of immune genes tend to be faster evolving than the average, which suggests an association with antagonistic co-evolution. Previous reports also indicated that the signal peptide of the imprinted gene, Igf2r, is also unusually fast evolving. This, it was hypothesized, might also be indicative of antagonistic co-evolution. Comparison of Igf2r's signal peptide evolution shows that, although it is not an outlier, its rate of evolution is comparable to that of many of the faster evolving immune system signal sequences and 5/6 of the amino acid changes do not conserve hydrophobicity. This is at least suggestive that there is something unusual about Igf2r's signal sequence.

9.
Three randomly derived sequences that can substitute for the signal peptide of Saccharomyces cerevisiae invertase were tested for the efficiency with which they can translocate invertase or beta-galactosidase into the endoplasmic reticulum. The rate of translocation, as measured by glycosylation, was estimated in pulse-chase experiments to be less than 6 min. When fused to beta-galactosidase, these peptides, like the normal invertase signal sequence, direct the hybrid protein to a perinuclear region, consistent with localization to the endoplasmic reticulum. The diversity of function of random peptides was studied further by immunofluorescence localization of proteins fused to 28 random sequences: 4 directed the hybrid to the endoplasmic reticulum, 3 directed it to the mitochondria, and 1 directed it to the nucleus.

10.
Wang M  Yang J  Chou KC 《Amino acids》2005,28(4):395-402
Summary. Owing to the importance of signal peptides for studying the molecular mechanisms of genetic diseases, reprogramming cells for gene therapy, and finding new drugs for healing a specific defect, there is great demand for a fast and accurate method to identify signal peptides. Introduction of the so-called {−3,−1, +1} coupling model (Chou, K. C.: Protein Engineering, 2001, 14–2, 75–79) has made it possible to take into account the coupling effect among some key subsites and hence can significantly enhance the quality of peptide cleavage site prediction. Based on the subsite coupling model, a family of string kernels for protein sequences is introduced. Because they integrate biologically relevant prior knowledge, the constructed string kernels can be used by any kernel-based method. A support vector machine (SVM) is then built to predict the cleavage site of signal peptides from the protein sequences. The current approach is compared with the classical weight matrix method. At small false positive ratios, our method outperforms the classical weight matrix method, indicating that the current approach may at least serve as a powerful complementary tool to other existing methods for predicting the signal peptide cleavage site. The software that generated the results reported in this paper is available upon request, and will appear at http://www.pami.sjtu.edu.cn/wm. An erratum to this article is available.
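The classical weight matrix method used as the baseline in this abstract can be sketched as a log-odds position weight matrix built from aligned cleavage-site windows. This is a generic illustration, not the paper's implementation: the uniform background, the pseudocount, the window layout, and the toy training windows are all assumptions. Note that a weight matrix treats positions independently, which is exactly the limitation the subsite coupling model is meant to address.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(windows, pseudocount=1.0):
    """Build a log-odds matrix from equal-length windows around known sites."""
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for position in range(len(windows[0])):
        counts = Counter(window[position] for window in windows)
        pwm.append({
            aa: math.log((counts[aa] + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return pwm


def best_site(sequence, pwm):
    """Return the start index of the highest-scoring window, or None."""
    length = len(pwm)
    scored = [
        (sum(column.get(aa, 0.0) for column, aa in zip(pwm, sequence[i:i + length])), i)
        for i in range(len(sequence) - length + 1)
    ]
    return max(scored)[1] if scored else None
```

With windows covering positions −3 to +1, the predicted cleavage point would lie between the third and fourth residue of the returned window; that convention is an assumption of this sketch.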

11.
A signal peptide is required for entry of a preprotein into the secretory pathway, but how it functions in concert with the other transport components is unknown. In Escherichia coli, SecA is a key component of the translocation machinery found in the cytoplasm and at membrane translocation sites. Synthetic signal peptides corresponding to the wild type alkaline phosphatase signal sequence and three sets of model signal sequences varying in hydrophobicity and amino-terminal charge were generated. These were used to establish the requirements for interaction with SecA. Binding to SecA, modulation of SecA conformations sensitive to protease, and stimulation of SecA-lipid ATPase activity occur with functional signal sequences but not with transport-incompetent ones. The extent of SecA interaction is directly related to the hydrophobicity of the signal peptide core region. For signal peptides of moderate hydrophobicity, stimulation of the SecA-lipid ATPase activity is also dependent on amino-terminal charge. The results demonstrate unequivocally that the signal peptide, in the absence of the mature protein, interacts with SecA in aqueous solution and in a lipid bilayer. We show a clear parallel between the hierarchy of signal peptide characteristics that promote interaction with SecA in vitro and the hierarchy of those observed for function in vivo.

12.
13.
Signal peptides and transmembrane helices both contain a stretch of hydrophobic amino acids. This common feature makes it difficult for signal peptide and transmembrane helix predictors to correctly assign identity to stretches of hydrophobic residues near the N-terminal methionine of a protein sequence. The inability to reliably distinguish between N-terminal transmembrane helix and signal peptide is an error with serious consequences for the prediction of protein secretory status or transmembrane topology. In this study, we report a new method for differentiating protein N-terminal signal peptides and transmembrane helices. Based on the sequence features extracted from hydrophobic regions (amino acid frequency, hydrophobicity, and the start position), we set up discriminant functions and examined them on non-redundant datasets with jackknife tests. This method can incorporate other signal peptide prediction methods and achieve higher prediction accuracy. For Gram-negative bacterial proteins, 95.7% of N-terminal signal peptides and transmembrane helices can be correctly predicted (coefficient 0.90). Given a sensitivity of 90%, transmembrane helices can be identified from signal peptides with a precision of 99% (coefficient 0.92). For eukaryotic proteins, 94.2% of N-terminal signal peptides and transmembrane helices can be correctly predicted with coefficient 0.83. Given a sensitivity of 90%, transmembrane helices can be identified from signal peptides with a precision of 87% (coefficient 0.85). The method can be used to complement current transmembrane protein prediction and signal peptide prediction methods to improve their prediction accuracies.
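The three feature types named in this abstract (amino acid frequency, hydrophobicity, and start position of the hydrophobic region) can be extracted with a short sketch like the one below. The use of the Kyte-Doolittle hydropathy scale, the 15-residue window, and the 60-residue N-terminal limit are assumptions; the abstract does not give the scale, the window, or the coefficients of the discriminant functions, so none are shown.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def most_hydrophobic_window(sequence, window=15, n_term=60):
    """Return (start, mean hydropathy, segment) of the best N-terminal window."""
    region = sequence[:n_term]
    if not region:
        raise ValueError("sequence is empty")
    window = min(window, len(region))
    best = (0, float("-inf"), "")
    for start in range(len(region) - window + 1):
        segment = region[start:start + window]
        # Residues missing from the scale (e.g. X) contribute 0.0.
        mean = sum(KYTE_DOOLITTLE.get(aa, 0.0) for aa in segment) / window
        if mean > best[1]:
            best = (start, mean, segment)
    return best


def region_features(sequence, window=15, n_term=60):
    """Collect start position, mean hydropathy, and residue frequencies."""
    start, mean, segment = most_hydrophobic_window(sequence, window, n_term)
    counts = Counter(segment)
    frequencies = {aa: n / len(segment) for aa, n in counts.items()}
    return {"start": start, "mean_hydropathy": mean, "frequencies": frequencies}
```

The resulting dictionary could be flattened into a feature vector for any linear or quadratic discriminant; which discriminant the authors used is not stated in the abstract.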

14.
Tjalsma H  van Dijl JM 《Proteomics》2005,5(17):4472-4482
The availability of complete bacterial genome sequences allows proteome-wide predictions of exported proteins that are potentially retained in the cytoplasmic membranes of the corresponding organisms. In practice, however, major problems are encountered with the computer-assisted distinction between (Sec-type) signal peptides that direct exported proteins into the growth medium and lipoprotein signal peptides or amino-terminal membrane anchors that cause protein retention in the membrane. In the present studies, which were aimed at improving methods to predict protein retention in the bacterial cytoplasmic membrane, we have compared sets of membrane-attached and extracellular proteins of Bacillus subtilis that were recently identified through proteomics approaches. The results showed that three classes of membrane-attached proteins can be distinguished. Two classes include 43 lipoproteins and 48 proteins with an amino-terminal transmembrane segment, respectively. Remarkably, a third class includes 31 proteins that remain membrane-retained despite the presence of typical Sec-type signal peptides with consensus signal peptidase recognition sites. This unprecedented finding indicates that unknown mechanisms are involved in membrane retention of this class of proteins. A further novelty is a consensus sequence indicative of release of certain lipoproteins from the membrane by proteolytic shaving. Finally, using non-overlapping sets of secreted and membrane-retained proteins, the accuracy of different signal peptide prediction algorithms was assessed. Accuracy for the prediction of protein retention in the membrane was increased to 82% using a majority-vote approach. Our findings provide important leads for future identification of surface proteins from pathogenic bacteria, which are attractive candidate infection markers and potential targets for drugs or vaccines.
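The majority-vote approach mentioned at the end of this abstract can be illustrated with a few lines of code. The label names and the tie-handling rule (returning `None`) are assumptions; the abstract does not say which predictors were combined or how ties were resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label, or None if empty or tied for first."""
    ranked = Counter(predictions).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no strict winner among the predictors
    return ranked[0][0]
```

For example, three hypothetical predictors voting `["retained", "secreted", "retained"]` give `"retained"`, while a split between two predictors gives `None` and leaves the protein unassigned.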

15.
BACKGROUND: Classification is difficult for shotgun metagenomics data from environments such as soils, where the diversity of sequences is high and where reference sequences from close relatives may not exist. Approaches based on sequence-similarity scores must deal with the confounding effects that inheritance and functional pressures exert on the relation between scores and phylogenetic distance, while approaches based on sequence alignment and tree-building are typically limited to a small fraction of gene families. We describe an approach based on finding one or more exact matches between a read and a precomputed set of peptide 10-mers. RESULTS: At even the largest phylogenetic distances, thousands of 10-mer peptide exact matches can be found between pairs of bacterial genomes. Genes that share one or more peptide 10-mers typically have high reciprocal BLAST scores. Among a set of 403 representative bacterial genomes, some 20 million 10-mer peptides were found to be shared. We assign each of these peptides as a signature of a particular node in a phylogenetic reference tree based on the RNA polymerase genes. We classify the phylogeny of a genomic fragment (e.g., read) at the most specific node on the reference tree that is consistent with the phylogeny of observed signature peptides it contains. Using both synthetic data from four newly-sequenced soil-bacterium genomes and ten real soil metagenomics data sets, we demonstrate a sensitivity and specificity comparable to that of the MEGAN metagenomics analysis package using BLASTX against the NR database. Phylogenetic and functional similarity metrics applied to real metagenomics data indicate a signal-to-noise ratio of approximately 400 for distinguishing among environments. Our method assigns ~6.6 Gbp/hr on a single CPU, compared with 25 kbp/hr for methods based on BLASTX against the NR database.
CONCLUSIONS: Classification by exact matching against a precomputed list of signature peptides provides comparable results to existing techniques for reads longer than about 300 bp and does not degrade severely with shorter reads. Orders of magnitude faster than existing methods, the approach is suitable now for inclusion in analysis pipelines and appears to be extensible in several different directions.
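The exact-match classification described in this abstract can be sketched with a dictionary lookup of 10-mers followed by a walk over the reference tree. Several simplifications are assumptions of this sketch: the input is an already translated peptide string (a real pipeline would presumably translate reads first), the tree is a single-rooted child-to-parent dictionary, and "most specific consistent node" is interpreted as the lowest common ancestor of the hit nodes after discarding hits that are ancestors of other hits. All names and the toy data in the usage note are hypothetical.

```python
def kmers(sequence, k=10):
    """Yield every overlapping k-mer of the sequence."""
    return (sequence[i:i + k] for i in range(len(sequence) - k + 1))


def lineage(node, parent):
    """Return the path from node up to the root, node first."""
    path = [node]
    while parent.get(node) is not None:
        node = parent[node]
        path.append(node)
    return path


def classify(peptide, signatures, parent, k=10):
    """Assign a peptide to a tree node from its exact signature k-mer hits."""
    hits = {signatures[m] for m in kmers(peptide, k) if m in signatures}
    if not hits:
        return None
    # Hits that are ancestors of another hit add no conflict; drop them.
    ancestors = set()
    for node in hits:
        ancestors.update(lineage(node, parent)[1:])
    paths = [lineage(node, parent) for node in hits - ancestors]
    # The first shared node on any root-ward path is the lowest common ancestor.
    shared = set(paths[0]).intersection(*paths[1:])
    return next(node for node in paths[0] if node in shared)
```

With a toy tree in which "Bacillus" and "Clostridium" are children of "Firmicutes", a peptide hitting signatures of "Bacillus" and "Firmicutes" is assigned to "Bacillus", while one hitting "Bacillus" and "Clostridium" falls back to "Firmicutes". Because each lookup is a constant-time dictionary access, throughput scales with read length rather than database size, which is in line with the reported gap between ~6.6 Gbp/hr and 25 kbp/hr (a ratio of about 264,000).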

16.
N-terminal signal sequences can direct nascent protein chains to the inner membrane of prokaryotes and the endoplasmic reticulum of eukaryotes by interacting with the signal recognition particle. In this study, we show that isolated peptides corresponding to several bacterial signal sequences inhibit the GTPase activity of the Escherichia coli signal recognition particle, as previously reported (Miller, J. D., Bernstein, H. D., and Walter, P. (1994) Nature 367, 657-659), but not by the direct mechanism proposed. Instead, isolated signal peptides bind nonspecifically to the RNA component and aggregate the entire signal recognition particle, leading to a loss of its intrinsic GTPase activity. Surprisingly, only "functional" peptide sequences aggregate RNA; the peptides in general use as "nonfunctional" negative controls (e.g. those with deletions or charged substitutions within the hydrophobic core) are sufficiently different in physical character that they do not aggregate RNA and thus have no effect on the GTPase activity of the signal recognition particle. We propose that the reported effect of functional signal peptides on the GTPase activity of the signal recognition particle is an artifact of the high peptide concentrations and low salt conditions used in these in vitro studies and that signal sequences at the N terminus of nascent chains in vivo do not exhibit this activity.

17.
We have screened a Hydra cDNA library for sequences encoding N-terminal signal peptides using the yeast invertase secretion vector pSUC [Jacobs et al., 1997. A genetic selection for isolating cDNAs encoding secreted proteins. Gene 198, 289-296]. We isolated and sequenced 907 positive clones; 88% encoded signal peptides; 12% lacked signal peptides. By searching the Hydra EST database we identified full-length sequences for the selected clones. These encoded 37 known proteins with signal peptides and 40 novel Hydra-specific proteins with signal peptides. Localization of two signal peptide-containing sequences, VEGF and ferritin, to the secretory pathway was confirmed with GFP fusion proteins. In addition, we isolated 105 clones which lacked signal peptides but which supported invertase secretion from yeast. Isolation of plasmids from these clones and retransformation in invertase-negative yeast cells confirmed the phenotype. A GFP fusion protein of one such clone encoding the foot morphogen pedibin was localized to the cytoplasm in transfected Hydra cells and did not enter the ER/Golgi secretory pathway. Secretion of pedibin and other proteins lacking signal peptides appears to occur by a non-classical protein secretion route.

18.
The information for correct localization of newly synthesized proteins in both prokaryotes and eukaryotes resides in self-contained, often transportable targeting sequences. Of these, signal sequences specify that a protein should be secreted from a cell or incorporated into the cytoplasmic membrane. A central puzzle is presented by the lack of primary structural homology among signal sequences, although they share common features in their sequences. Synthetic signal peptides have enabled a wide range of studies of how these "zipcodes" for protein secretion are decoded and used to target proteins to the protein machinery that facilitates their translocation across and integration into membranes. We review research on how the information in signal sequences enables their passenger proteins to be correctly and efficiently localized. Synthetic signal peptides have made possible binding and crosslinking studies to explore how selectivity is achieved in recognition by the signal sequence-binding receptors: signal recognition particle, or SRP, which functions in all organisms, and SecA, which functions in prokaryotes and some organelles of prokaryotic origin. While progress has been made, the absence of atomic resolution structures for complexes of signal peptides and their receptors has left many questions to be answered in the future.

19.
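Tang W  Yan M 《Acta Microbiologica Sinica》2008,48(4):473-479
[Objective] Trichoderma reesei is an important industrial cellulase-producing strain, so studying the characteristics of its secretome is of practical significance. [Methods] Bioinformatics methods were used to analyze the amino acid sequences encoded by the 9997 open reading frames (ORFs) in the T. reesei genome. This yielded 294 putative secreted protein sequences, which were classified by function. A motif search was then used to find sequences with key motifs among the sequences of unknown function, giving a preliminary indication of their potential functions. The signal peptide sequences of the secreted proteins obtained were also analyzed. [Results] The T. reesei secretome contains 188 hydrolases, including 114 glycoside hydrolases, 42 proteases, and 11 lipid hydrolases. The glycoside hydrolases include the 22 cellulases and 15 chitinases already reported, as well as 30 protein sequences with potential cellulase function. Analysis of the signal peptide sequences showed that their homology is low, whereas the region near the signal peptidase cleavage site is relatively conserved. [Conclusion] This prediction and analysis broadens the scope for research on T. reesei and lays a theoretical foundation for future studies.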

20.
MOTIVATION: Experimental evidence suggests that certain short protein segments have stronger amyloidogenic propensities than others. Identification of the fibril-forming segments of proteins is crucial for understanding diseases associated with protein misfolding and for finding favorable targets for therapeutic strategies. RESULT: In this study, we used the microcrystal structure of the NNQQNY peptide from yeast prion protein and residue-based statistical potentials to establish an algorithm to identify the amyloid fibril-forming segment of proteins. Using the same sets of sequences, this study obtained prediction performance comparable to that of the 3D profile method based on the physical atomic-level potential ROSETTADESIGN. The predicted results are consistent with experiments for several representative proteins associated with amyloidosis, and also agree with the idea that peptides that can form fibrils may have strong sequence signatures. Application of the residue-based statistical potentials is computationally more efficient than using atomic-level potentials and can be applied in whole proteome analysis to investigate the evolutionary pressure effect or forecast other latent diseases related to amyloid deposits. AVAILABILITY: The fibril prediction program is available at ftp://mdl.ipc.pku.edu.cn/pub/software/pre-amyl/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
