Similar literature
 20 similar articles found (search time: 15 ms)
1.
2.
A neural network-based tool, TargetP, for large-scale subcellular location prediction of newly identified proteins has been developed. Using N-terminal sequence information only, it discriminates between proteins destined for the mitochondrion, the chloroplast, the secretory pathway, and "other" localizations with a success rate of 85% (plant) or 90% (non-plant) on redundancy-reduced test sets. From a TargetP analysis of the recently sequenced Arabidopsis thaliana chromosomes 2 and 4 and the Ensembl Homo sapiens protein set, we estimate that 10% of all plant proteins are mitochondrial and 14% chloroplastic, and that the abundance of secretory proteins, in both Arabidopsis and Homo, is around 10%. TargetP also predicts cleavage sites with levels of correctly predicted sites ranging from approximately 40% to 50% (chloroplastic and mitochondrial presequences) to above 70% (secretory signal peptides). TargetP is available as a web-server at http://www.cbs.dtu.dk/services/TargetP/.

3.
Dictyostelium discoideum has been suggested as a eukaryotic model organism for glycobiology studies. Presently, the characteristics of acceptor sites for the N-acetylglucosaminyl-transferases in Dictyostelium discoideum, which link GlcNAc in an alpha linkage to hydroxyl residues, are largely unknown. This motivates the development of a species-specific method for prediction of O-linked GlcNAc glycosylation sites in secreted and membrane proteins of D. discoideum. The method presented here employs a jury of artificial neural networks. These networks were trained to recognize the sequence context and protein surface accessibility in 39 experimentally determined O-alpha-GlcNAc sites found in D. discoideum glycoproteins expressed in vivo. Cross-validation of the data revealed a correlation in which 97% of the glycosylated and nonglycosylated sites were correctly identified. Based on the currently limited data set, an abundant periodicity of two (positions -3, -1, +1, +3, etc.) in proline residues alternating with hydroxyl amino acids was observed upstream and downstream of the acceptor site. This was a consequence of the spacing of the glycosylated residues themselves, which were peculiarly found to be situated only at even positions with respect to each other, indicating that these may be located within beta-strands. The method has been used for a rapid and ranked scan of the fraction of the Dictyostelium proteome available in public databases; remarkably, 25-30% of these proteins were predicted to be glycosylated. The scan revealed acceptor sites in several proteins known experimentally to be O-glycosylated at unmapped sites. The available proteome was classified into functional and cellular compartments to study any preferential patterns of glycosylation. A sequence-based prediction server for GlcNAc O-glycosylations in D. discoideum proteins has been made available through the WWW at http://www.cbs.dtu.dk/services/DictyOGlyc/ and via E-mail to DictyOGlyc@cbs.dtu.dk.

4.
We have developed a new method for the identification of signal peptides and their cleavage sites based on neural networks trained on separate sets of prokaryotic and eukaryotic sequences. The method performs significantly better than previous prediction schemes, and can easily be applied to genome-wide data sets. Discrimination between cleaved signal peptides and uncleaved N-terminal signal-anchor sequences is also possible, though with lower precision. Predictions can be made on a publicly available WWW server: http://www.cbs.dtu.dk/services/SignalP/.

5.
Protein phosphorylation plays a key role in cell regulation, and identification of phosphorylation sites is important for understanding their functional significance. Here, we present an artificial neural network algorithm, NetPhosK (http://www.cbs.dtu.dk/services/NetPhosK/), that predicts protein kinase A (PKA) phosphorylation sites. The neural network was trained with a positive set of 258 experimentally verified PKA phosphorylation sites. The predictions by NetPhosK were validated using four novel PKA substrates: Necdin, RFX5, En-2, and Wee1. The four proteins were phosphorylated by PKA in vitro and 13 PKA phosphorylation sites were identified by mass spectrometry. NetPhosK was 100% sensitive and 41% specific in predicting PKA sites in the four proteins. These results demonstrate the potential of using integrated computational and experimental methods for detailed investigations of the phosphoproteome.

6.
We present a neural network-based method (ChloroP) for identifying chloroplast transit peptides and their cleavage sites. Using cross-validation, 88% of the sequences in our homology-reduced training set were correctly classified as transit peptides or nontransit peptides. This performance level is well above that of the publicly available chloroplast localization predictor PSORT. Cleavage sites are predicted using a scoring matrix derived by an automatic motif-finding algorithm. Approximately 60% of the known cleavage sites in our sequence collection were predicted to within +/-2 residues of the cleavage sites given in SWISS-PROT. An analysis of 715 Arabidopsis thaliana sequences from SWISS-PROT suggests that the ChloroP method should be useful for the identification of putative transit peptides in genome-wide sequence data. The ChloroP predictor is available as a web-server at http://www.cbs.dtu.dk/services/ChloroP/.
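
The +/-2 residue criterion in the abstract above can be illustrated with a short sketch. This is not the ChloroP evaluation code; the function name `fraction_within_tolerance` and the example positions are hypothetical, chosen only to show how a tolerance-based hit rate might be computed.

```python
def fraction_within_tolerance(predicted, annotated, tolerance=2):
    """Fraction of predicted cleavage positions within +/- tolerance
    residues of the annotated position (one pair per sequence)."""
    if len(predicted) != len(annotated):
        raise ValueError("predicted and annotated must have the same length")
    if not predicted:
        raise ValueError("at least one sequence is required")
    hits = sum(
        1 for p, a in zip(predicted, annotated) if abs(p - a) <= tolerance
    )
    return hits / len(predicted)


# Hypothetical positions for five sequences (not data from the paper).
predicted_sites = [52, 40, 61, 33, 70]
annotated_sites = [50, 45, 61, 30, 71]
print(fraction_within_tolerance(predicted_sites, annotated_sites))
```

A tolerance of this kind credits predictions that land near the annotated site, which may be reasonable when the annotated positions themselves carry some uncertainty.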

7.
The specificities of the UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases, which link the carbohydrate GalNAc to the side-chain of certain serine and threonine residues in mucin type glycoproteins, are presently unknown. The specificity seems to be modulated by sequence context, secondary structure and surface accessibility. The sequence context of glycosylated threonines was found to differ from that of serine, and the sites were found to cluster. Non-clustered sites had a sequence context different from that of clustered sites. Charged residues were disfavoured at positions -1 and +3. A jury of artificial neural networks was trained to recognize the sequence context and surface accessibility of 299 known and verified mucin type O-glycosylation sites extracted from O-GLYCBASE. The cross-validated NetOglyc network system correctly found 83% of the glycosylated and 90% of the non-glycosylated serine and threonine residues in independent test sets, thus proving more accurate than matrix statistics and vector projection methods. Predictions of O-glycosylation sites in the envelope glycoprotein gp120 from the primate lentiviruses HIV-1, HIV-2 and SIV are presented. The most conserved O-glycosylation signals in these evolutionarily related glycoproteins were found in their first hypervariable loop, V1. However, the strain variation for HIV-1 gp120 was significant. A computer server, available through WWW or E-mail, has been developed for prediction of mucin type O-glycosylation sites in proteins based on the amino acid sequence. The server addresses are http://www.cbs.dtu.dk/services/NetOGlyc/ and netOglyc@cbs.dtu.dk.

8.
O-GalNAc-glycosylation is one of the main types of glycosylation in mammalian cells. No consensus recognition sequence for the O-glycosyltransferases is known, making prediction methods necessary to bridge the gap between the large number of known protein sequences and the small number of proteins experimentally investigated with regard to glycosylation status. From O-GLYCBASE a total of 86 mammalian proteins experimentally investigated for in vivo O-GalNAc sites were extracted. Mammalian protein homolog comparisons showed that a glycosylated serine or threonine is less likely to be precisely conserved than a nonglycosylated one. The Protein Data Bank was analyzed for structural information, and 12 glycosylated structures were obtained. All positive sites were found in coil or turn regions. A method for predicting the location of mucin-type glycosylation sites was trained using a neural network approach. The best overall network used as input amino acid composition and averaged surface accessibility predictions, together with a substitution matrix profile encoding of the sequence. To improve prediction on isolated (single) sites, networks were trained on isolated sites only. The final method combines predictions from the best overall network and the best isolated site network; this prediction method correctly predicted 76% of the glycosylated residues and 93% of the nonglycosylated residues. NetOGlyc 3.1 can predict sites for completely new proteins without losing its performance. The fact that the sites could be predicted from averaged properties, together with the fact that glycosylation sites are not precisely conserved, indicates that mucin-type glycosylation in most cases is a bulk property and not a very site-specific one. NetOGlyc 3.1 is made available at www.cbs.dtu.dk/services/netoglyc.

9.
Artificial neural networks have been combined with a rule-based system to predict intron splice sites in the dicot plant Arabidopsis thaliana. A two-step prediction scheme, where a global prediction of the coding potential regulates a cutoff level for a local prediction of splice sites, is refined by rules based on splice site confidence values, prediction scores, coding context and distances between potential splice sites. In this approach, the predictions of splice sites mutually affect one another in a non-local manner. The combined approach drastically reduces the large number of false positive splice sites normally haunting splice site prediction. An analysis of the errors made by the networks in the first step of the method revealed a previously unknown feature, a frequent T-tract prolongation containing cryptic acceptor sites in the 5' end of exons. The method presented here has been compared with three other approaches, GeneFinder, Gene-Mark and Grail. Overall, the method presented here is an order of magnitude better. We show that the new method is able to find a donor site in the coding sequence for the jellyfish Green Fluorescent Protein, exactly at the position that was experimentally observed in A. thaliana transformants. Predictions for alternatively spliced genes are also presented, together with examples of genes from other dicots, monocots and algae. The method has been made available through electronic mail (NetPlantGene@cbs.dtu.dk), or the WWW at http://www.cbs.dtu.dk/NetPlantGene.html

10.
NetPhosYeast: prediction of protein phosphorylation sites in yeast   Total citations: 3, self-citations: 0, citations by others: 3
Here we present a neural network-based method for the prediction of protein phosphorylation sites in yeast, an important model organism for basic research. Existing protein phosphorylation site predictors are primarily based on mammalian data and show reduced sensitivity on yeast phosphorylation sites compared to those in humans, suggesting the need for a yeast-specific phosphorylation site predictor. NetPhosYeast achieves a correlation coefficient close to 0.75 with a sensitivity of 0.84 and specificity of 0.90 and outperforms existing predictors in the identification of phosphorylation sites in yeast. AVAILABILITY: The NetPhosYeast prediction service is available as a public web server at http://www.cbs.dtu.dk/services/NetPhosYeast/.
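
For readers unfamiliar with the three figures quoted in the abstract above, the following sketch shows how sensitivity, specificity and a correlation coefficient are commonly derived from a confusion matrix. It assumes the correlation coefficient is the Matthews correlation coefficient, which the abstract does not state explicitly. The function name and the counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and Matthews correlation coefficient
    from confusion-matrix counts for a two-class site predictor."""
    sensitivity = tp / (tp + fn)  # fraction of true sites recovered
    specificity = tn / (tn + fp)  # fraction of non-sites rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: MCC is 0 when any marginal total is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts, for illustration only.
sens, spec, mcc = binary_metrics(tp=8, fp=1, tn=9, fn=2)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} MCC={mcc:.2f}")
```

Note that some publications use "specificity" for what is elsewhere called positive predictive value, TP / (TP + FP), so the definition should be checked against each paper before comparing numbers.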

11.
Prediction of proteasome cleavage motifs by neural networks   Total citations: 20, self-citations: 0, citations by others: 20
We present a predictive method that can simulate an essential step in antigen presentation in higher vertebrates, namely the step involving the proteasomal degradation of polypeptides into fragments which have the potential to bind to MHC Class I molecules. Proteasomal cleavage prediction algorithms published so far were trained on data from in vitro digestion experiments with constitutive proteasomes. As a result, they did not take into account the characteristics of the structurally modified proteasomes (often called immunoproteasomes) found in cells stimulated by gamma-interferon under physiological conditions. Our algorithm has been trained not only on in vitro data, but also on MHC Class I ligand data, which reflect a combination of immunoproteasome and constitutive proteasome specificity. This feature, together with the use of neural networks, a non-linear classification technique, makes the prediction of MHC Class I ligand boundaries more accurate: 65% of the cleavage sites and 85% of the non-cleavage sites are correctly determined. Moreover, we show that the neural networks trained on the constitutive proteasome data learn a specificity that differs from that of the networks trained on MHC Class I ligands, i.e. the specificity of the immunoproteasome is different from that of the constitutive proteasome. The tools developed in this study, in combination with a predictor of MHC and TAP binding capacity, should give a more complete prediction of the generation and presentation of peptides on MHC Class I molecules. Here we demonstrate that such an approach produces an accurate prediction of the CTL epitopes in HIV Nef. The method is available at www.cbs.dtu.dk/services/NetChop/.

12.
Gao F, Ou HY, Chen LL, Zheng WX, Zhang CT. FEBS Letters, 2003, 553(3): 451-456
Recently, we have developed a coronavirus-specific gene-finding system, ZCURVE_CoV 1.0. In this paper, the system is further improved by taking the prediction of cleavage sites of viral proteinases in polyproteins into account. The cleavage sites of the 3C-like proteinase and papain-like proteinase are highly conserved. In addition to a traditional positional weight matrix trained on the peptides around cleavage sites, the present method also takes into account the length conservation of non-structural proteins cleaved by the 3C-like proteinase and papain-like proteinase to reduce the false positive prediction rate. The improved system, ZCURVE_CoV 2.0, has been run for each of the 24 completely sequenced coronavirus genomes in GenBank. Consequently, all the non-structural proteins in the 24 genomes are accurately predicted. Compared with known annotations, the performance of the present method is satisfactory. The software ZCURVE_CoV 2.0 is freely available at http://tubic.tju.edu.cn/sars/.
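
As a rough illustration of the positional weight matrix idea mentioned in the abstract above, the sketch below builds a log-odds matrix from aligned peptides and scans a sequence for the best-scoring window. It is a generic textbook construction, not the ZCURVE_CoV 2.0 implementation: the uniform background, the pseudocount, the function names and the training peptides are all assumptions made for the example, and the length-conservation filter described in the abstract is not modeled.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(peptides, pseudocount=1.0):
    """Log-odds matrix from equal-length peptides, uniform background."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        pwm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pwm


def score_window(pwm, window):
    """Sum of per-position log-odds scores for one window."""
    return sum(column[aa] for column, aa in zip(pwm, window))


def best_site(pwm, sequence):
    """Return (score, start index) of the highest-scoring window."""
    width = len(pwm)
    return max(
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical aligned peptides around a cleavage site (illustrative only).
training = ["AVLQSG", "TVLQAG", "ATLQSG", "SVLQSA"]
pwm = build_pwm(training)
score, start = best_site(pwm, "MKKAVLQSGFRK")
print(start, round(score, 2))
```

In practice a score threshold, rather than the single best window, would decide which positions are reported, and an additional filter such as the expected length of the cleaved product could then discard implausible hits.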

13.
Among the Picornaviridae, hepatitis A virus (HAV) is unique in that its assembly is driven by domain 2A of P1-2A, the precursor of the structural proteins (Probst, C., Jecht, M., and Gauss-Müller, V. (1999) J. Biol. Chem. 274, 4527-4531). Whereas infected individuals excrete mature HAV capsids in stool with VP1 as the major structural protein, its C-terminal extended form VP1-2A is the main component of immature procapsids produced in HAV-infected cells in culture. Obviously, a postassembly proteolytic step is required to remove the primary assembly signal 2A from VP1-2A of procapsids. Mutants of VP1-2A were expressed in COS7 cells to determine the cleavage site in VP1-2A and to test for the cleavage potential of viral and host proteinases (factor Xa and thrombin). Site-specific in vitro cleavage by factor Xa and thrombin occurred in procapsids that contained VP1-2A with engineered cognate cleavage sites for these proteinases. Interestingly, factor Xa but not thrombin also liberated mature VP1 from native procapsids in an assembly-dependent manner. The data show that domain 2A, which is required for pentamerization of its precursor polypeptides and thus for the primary step of HAV assembly, is removed from the surface of immature procapsids by a host proteinase. Moreover, our data open a novel avenue to produce homogeneous HAV particles from recombinant intermediates by in vitro treatment with exogenously added proteases such as factor Xa or thrombin.

14.
Several accurate prediction systems have been developed for prediction of class I major histocompatibility complex (MHC):peptide binding. Most of these are trained on binding affinity data of primarily 9mer peptides. Here, we show how prediction methods trained on 9mer data can be used for accurate binding affinity prediction of peptides of length 8, 10 and 11. The method gives the opportunity to predict peptides with a different length than nine for MHC alleles where no such peptides have been measured. As validation, the performance of this approach is compared to predictors trained on peptides of the peptide length in question. In this validation, the approximation method has an accuracy that is comparable to or better than methods trained on a peptide length identical to the predicted peptides. AVAILABILITY: The algorithm has been implemented in the web-accessible servers NetMHC-3.0: http://www.cbs.dtu.dk/services/NetMHC-3.0, and NetMHCpan-1.1: http://www.cbs.dtu.dk/services/NetMHCpan-1.1

15.
Post-translational modifications (PTMs) occur on almost all proteins analyzed to date. The function of a modified protein is often strongly affected by these modifications and therefore increased knowledge about the potential PTMs of a target protein may increase our understanding of the molecular processes in which it takes part. High-throughput methods for the identification of PTMs are being developed, in particular within the fields of proteomics and mass spectrometry. However, these methods are still in their early stages, and it is indeed advantageous to cut down on the number of experimental steps by integrating computational approaches into the validation procedures. Many advanced methods for the prediction of PTMs exist and many are made publicly available. We describe our experiences with the development of prediction methods for phosphorylation and glycosylation sites and the development of PTM-specific databases. In addition, we discuss novel ideas for PTM visualization (exemplified by kinase landscapes) and improvements for prediction specificity (by using ESS--evolutionary stable sites). As an example, we present a new method for kinase-specific prediction of phosphorylation sites, NetPhosK, which extends our earlier and more general tool, NetPhos. The new server, NetPhosK, is made publicly available at the URL http://www.cbs.dtu.dk/services/NetPhosK/. The issues of underestimation, over-prediction and strategies for improving prediction specificity are also discussed.

16.
Neural network predicts sequence of TP53 gene based on DNA chip   Total citations: 2, self-citations: 0, citations by others: 2
We have trained an artificial neural network to predict the sequence of the human TP53 tumor suppressor gene based on a p53 GeneChip. The trained neural network uses as input the fluorescence intensities of DNA hybridized to oligonucleotides on the surface of the chip and makes between zero and four errors in the predicted 1300 bp sequence when tested on wild-type TP53 sequence. AVAILABILITY: The trained neural network is available for academic use by contacting steen@cbs.dtu.dk

17.
NetCGlyc 1.0: prediction of mammalian C-mannosylation sites   Total citations: 2, self-citations: 0, citations by others: 2
Julenius K. Glycobiology, 2007, 17(8): 868-876

18.
The proteins of flaviviruses are translated as a single long polyprotein which is co- and posttranslationally processed by both cellular and viral proteinases. We have studied the processing of flavivirus polyproteins in vitro by a viral proteinase located within protein NS3 that cleaves at least three sites within the nonstructural region of the polyprotein, acting primarily autocatalytically. Recombinant polyproteins in which part of the polyprotein is derived from yellow fever virus and part from dengue virus were used. We found that polyproteins containing the yellow fever virus cleavage sites were processed efficiently by the yellow fever virus enzyme, by the dengue virus enzyme, and by various chimeric enzymes. In contrast, dengue virus cleavage sites were cleaved inefficiently by the dengue virus enzyme and not at all by the yellow fever virus enzyme. Studies with chimeric proteinases and with site-directed mutants provided evidence for a direct interaction between the cleavage sites and the proposed substrate-binding pocket of the enzyme. We also found that the efficiency and order of processing could be altered by site-directed mutagenesis of the proposed substrate-binding pocket.

19.
Prediction of splice sites in non-coding regions of genes is one of the most challenging aspects of gene structure recognition. We perform a rigorous analysis of such splice sites embedded in human 5' untranslated regions (UTRs), and investigate correlations between this class of splice sites and other features found in the adjacent exons and introns. By restricting the training of neural network algorithms to 'pure' UTRs (not extending partially into protein coding regions), we investigate for the first time the predictive power of the splicing signal proper, in contrast to conventional splice site prediction, which typically relies on the change in sequence at the transition from protein coding to non-coding. By doing so, the algorithms were able to pick up subtler splicing signals that were otherwise masked by 'coding' noise, thus significantly enhancing the prediction of 5' UTR splice sites. For example, the non-coding splice site predicting networks pick up compositional and positional bias in the 3' ends of non-coding exons and 5' non-coding intron ends, where cytosine and guanine are over-represented. This compositional bias at the true UTR donor sites is also visible in the synaptic weights of the neural networks trained to identify UTR donor sites. Conventional splice site prediction methods perform poorly in UTRs because the reading frame pattern is absent. The NetUTR method presented here performs 2-3-fold better compared with NetGene2 and GenScan in 5' UTRs. We also tested the 5' UTR trained method on protein coding regions, and discovered, surprisingly, that it works quite well (although it cannot compete with NetGene2). This indicates that the local splicing pattern in UTRs and coding regions is largely the same. The NetUTR method is made publicly available at www.cbs.dtu.dk/services/NetUTR.

20.
Proteolytic cleavage of virus-specific proteins is a universal phenomenon, widespread among different viruses including bacterial, plant, animal, and human viruses. Proteolytic processing of viral proteins involves cleavage at strictly specific sites (proteolytic sites) of polyprotein molecules. Specificity of this processing is a doubly dependent event controlled by the amino acids of proteolytic sites and the presence of adequate proteinases. Host-originated and/or virus-coded proteinases are known to perform the cleavage of viral polypeptides. Conformational and functional behaviour of many virus proteins is regulated by proteolytic modification; as a result, the reproduction of mature virions and the infection pathways are directly controlled. Molecular mechanisms of site-specific proteolytic processing of viral proteins are proposed as a target to be attacked for chemotherapeutic virus inhibition and to be modified for vaccine design. Approaches to realising this antiviral strategy are analysed, and prospects for its development are discussed.
