Similar articles
20 similar articles found (search time: 15 ms)
1.

Background  

Predicting the subcellular localization of proteins is important for determining their function. Previous work on predicting protein localization in Gram-negative bacteria obtained good results. However, these methods had relatively low accuracy for the localization of extracellular proteins. This paper studies ways to improve the accuracy of predicting extracellular localization in Gram-negative bacteria.

2.
The function of a protein is intimately tied to its subcellular localization. Although localizations have been measured for many yeast proteins through systematic GFP fusions, similar studies in other branches of life are still forthcoming. In the interim, various machine-learning methods have been proposed to predict localization using physical characteristics of a protein, such as amino acid content, hydrophobicity, side-chain mass and domain composition. However, there has been comparatively little work on predicting localization using protein networks. Here, we predict protein localizations by integrating an extensive set of protein physical characteristics over a protein's extended protein-protein interaction neighborhood, using a classification framework called 'Divide and Conquer k-Nearest Neighbors' (DC-kNN). These predictions achieve significantly higher accuracy than two well-known methods for predicting protein localization in yeast. Using new GFP imaging experiments, we show that the network-based approach can extend and revise previous annotations made from high-throughput studies. Finally, we show that our approach remains highly predictive in higher eukaryotes such as fly and human, in which most localizations are unknown and the protein network coverage is less substantial.

3.
Here we report a systematic approach for predicting subcellular localization (cytoplasm, mitochondrial, nuclear, and plasma membrane) of human proteins. First, support vector machine (SVM)-based modules for predicting subcellular localization using traditional amino acid and dipeptide (i + 1) composition achieved overall accuracies of 76.6 and 77.8%, respectively. PSI-BLAST, when carried out using a similarity-based search against a nonredundant database of experimentally annotated proteins, yielded 73.3% accuracy. To gain further insight, a hybrid module (hybrid1) was developed based on amino acid composition, dipeptide composition, and similarity information and attained a better accuracy of 84.9%. In addition, SVM modules based on different higher-order dipeptides, i.e., i + 2, i + 3, and i + 4, were also constructed for the prediction of subcellular localization of human proteins, and overall accuracies of 79.7, 77.5, and 77.1% were achieved, respectively. Furthermore, another SVM module, hybrid2, was developed using traditional dipeptide (i + 1) and higher-order dipeptide (i + 2, i + 3, and i + 4) compositions, which gave an overall accuracy of 81.3%. We also developed the SVM module hybrid3 based on amino acid composition, traditional and higher-order dipeptide compositions, and PSI-BLAST output and achieved an overall accuracy of 84.4%. A Web server, HSLPred (www.imtech.res.in/raghava/hslpred/ or bioinformatics.uams.edu/raghava/hslpred/), has been designed to predict subcellular localization of human proteins using the above approaches.
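
The composition features named in this abstract can be illustrated with a minimal Python sketch. It is not the HSLPred implementation: the function names are hypothetical, and it assumes that "i + k" denotes ordered residue pairs taken k positions apart and that only the 20 standard amino acids are counted.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in seq (20 values)."""
    seq = seq.upper()
    total = sum(seq.count(a) for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(a) / total for a in AMINO_ACIDS]


def dipeptide_composition(seq, gap=1):
    """Return the fraction of each ordered residue pair (i, i + gap); 400 values.

    gap=1 corresponds to the traditional dipeptide; gap=2, 3, 4 are assumed
    to correspond to the higher-order variants mentioned in the abstract.
    """
    seq = seq.upper()
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    for i in range(len(seq) - gap):
        pair = seq[i] + seq[i + gap]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(pairs)
    return [counts[p] / total for p in pairs]
```

Concatenating the 20-value vector with one or more 400-value vectors would give a fixed-length input for a classifier such as an SVM, which is one plausible reading of how the hybrid modules combine features.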

4.
MOTIVATION: Each protein performs its functions within specific locations in a cell. This subcellular location is important for understanding protein function and for facilitating its purification. There are now many computational techniques for predicting location based on sequence analysis and database information from homologs. A few recent techniques use text from biological abstracts; our goal is to improve the prediction accuracy of such text-based techniques. We identify three techniques for improving text-based prediction: a rule for ambiguous abstract removal, a mechanism for using synonyms from the Gene Ontology (GO) and a mechanism for using the GO hierarchy to generalize terms. We show that these three techniques can significantly improve the accuracy of protein subcellular location predictors that use text extracted from PubMed abstracts whose references are recorded in Swiss-Prot.

5.
6.
The ability to predict the subcellular localization of a protein from its sequence is of great importance, as it provides information about the protein's function. We present a computational tool, PredSL, which utilizes neural networks, Markov chains, profile hidden Markov models, and scoring matrices for the prediction of the subcellular localization of proteins in eukaryotic cells from the N-terminal amino acid sequence. It aims to classify proteins into five groups: chloroplast, thylakoid, mitochondrion, secretory pathway, and "other". When tested in a fivefold cross-validation procedure, PredSL demonstrates 86.7% and 87.1% overall accuracy for the plant and non-plant datasets, respectively. Compared with TargetP, which is the most widely used method to date, and LumenP, the results of PredSL are comparable in most cases. When tested on the experimentally verified proteins of the Saccharomyces cerevisiae genome, PredSL performs comparably to, if not better than, any available algorithm for the same task. Furthermore, PredSL is the only method capable of predicting these subcellular localizations that is available as a stand-alone application, through the URL: http://bioinformatics.biol.uoa.gr/PredSL/.

7.
MOTIVATION: Functional annotation of unknown proteins is a major goal in proteomics. A key annotation is the prediction of a protein's subcellular localization. Numerous prediction techniques have been developed, typically focusing on a single underlying biological aspect or predicting a subset of all possible localizations. An important step is taken towards emulating the protein sorting process by capturing and bringing together biologically relevant information, and addressing the clear need to improve prediction accuracy and localization coverage. RESULTS: Here we present a novel SVM-based approach for predicting subcellular localization, which integrates N-terminal targeting sequences, amino acid composition and protein sequence motifs. We show how this approach improves the prediction based on N-terminal targeting sequences, by comparing our method TargetLoc against existing methods. Furthermore, MultiLoc performs considerably better than comparable methods predicting all major eukaryotic subcellular localizations, and shows better or comparable results to methods that are specialized for fewer localizations or for one organism. AVAILABILITY: http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc/

8.
We evaluated the efficiency of the best linear unbiased predictor (BLUP) and the influence of the use of similarity in state (SIS) and similarity by descent (SBD) in the prediction of untested maize hybrids. Nine inbred lines of maize were crossed using a randomized complete diallel method. These materials were genotyped with 48 microsatellite markers (SSR) associated with the QTL regions for grain yield. Estimates of four coefficients of SIS and four coefficients of SBD were used to construct the additive genetic and dominance matrices, which were later used in combination with the BLUP for predicting genotypic values and specific combining ability (SCA) in unanalyzed hybrids under simulated unbalance. The correlations between the predicted genotypic values and the observed means, depending on the degree of unbalance, ranged from 0.48 to 0.99 using SIS and from 0.40 to 0.99 using SBD. The correlations obtained for the SCA ranged from 0.26 to 0.98 using SIS and from 0.001 to 0.990 using SBD. It was also observed that the predictions using SBD were less biased than the SIS predictions, demonstrating that the predictions obtained with the SBD coefficients were closer to the observed values but were less efficient in the ranking of genotypes. Although SIS showed a bias due to overestimation of relatedness, this type of coefficient may be used where low SBD values are detected in the group of parents, because of its greater efficiency in ranking the candidate hybrids.

9.
10.
The subcellular localization of a protein can provide important information about its function within the cell. As eukaryotic cells and particularly mammalian cells are characterized by a high degree of compartmentalization, most protein activities can be assigned to particular cellular compartments. The categorization of proteins by their subcellular localization is therefore one of the essential goals of the functional annotation of the human genome. We previously performed a subcellular localization screen of 52 proteins encoded on human chromosome 21. In the current study, we compared the experimental localization data to the in silico results generated by nine leading software packages with different prediction resolutions. The comparison revealed striking differences between the programs in the accuracy of their subcellular protein localization predictions. Our results strongly suggest that the recently developed predictors utilizing multiple prediction methods tend to provide significantly better performance than purely sequence-based or homology-based predictions.

11.
Protein tertiary structure prediction using a branch and bound algorithm (cited 2 times: 0 self-citations, 2 by others)
We report a new method for predicting protein tertiary structure from sequence and secondary structure information. The predictions result from global optimization of a potential energy function, including van der Waals, hydrophobic, and excluded volume terms. The optimization algorithm, which is based on the alphaBB method developed by Floudas and coworkers (Costas and Floudas, J Chem Phys 1994;100:1247-1261), uses a reduced model of the protein and is implemented in both distance and dihedral angle space, enabling a side-by-side comparison of methodologies. For a set of eight small proteins, representing the three basic types (all alpha, all beta, and mixed alpha/beta), the algorithm locates low-energy native-like structures (less than 6 Å root mean square deviation from the native coordinates) starting from an unfolded state. Serial and parallel implementations of this methodology are discussed.
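
The root mean square deviation criterion quoted in this abstract can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function name is hypothetical, and it assumes the two coordinate sets list the same atoms in the same order and have already been superimposed (no optimal rotation or translation is computed here).

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z).

    Assumes both structures are already aligned in the same frame;
    a full comparison would first superimpose them.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

Under the abstract's criterion, a predicted structure would count as native-like when this value, computed against the native coordinates in Å, is below 6.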

12.
Automated sequence annotation is a major goal of the post-genomic era, with hundreds of genomes in the databases, from both prokaryotes and eukaryotes. While the number of fully sequenced chromosomes from microbial organisms increased exponentially in the last decade to more than 600, presently we know the whole DNA content of only 25 eukaryotic organisms, including Homo sapiens. However, the process of genome annotation is far from being completed. This is particularly relevant in eukaryotes, whose cells contain several subcellular compartments, or organelles, enclosed by membranes, where different relevant functions are performed. Translocation across the membrane into the organelles is a highly regulated and complex cellular process. Indeed, different proteins and/or protein isoforms, originating from genes by alternative splicing, may be conveyed to different cell compartments, depending on their specific role in the cell. In recent years the prediction of subcellular localization (SL) by computational means has been an active research area. Several methods are presently available, based on different notions and addressing different aspects of SL. This review provides a short overview of the best-performing methods described in the literature, highlighting their predictive capabilities and different applications.

13.
We explored a novel approach to the functional regulation of nuclear proteins: altering their subcellular localization. To anchor a nuclear protein, beta-galactosidase with the nuclear localization signal of SV40 (nbeta-gal), within the cytoplasm, nbeta-gal was fused to the transmembrane domain of granulocyte colony-stimulating factor receptor (G-CSFR), a membrane protein. To liberate the nbeta-gal portion from the fusion protein, we used a protease derived from a plant virus, whose recognition sequence was inserted between the G-CSFR and nbeta-gal. Western analysis showed that the chimeric protein was cleaved in the presence of the protease in 293 cells and that the fusion protein without the recognition sequence remained intact. This chimeric protein was localized exclusively in the cytoplasm as visualized by X-gal staining and immunofluorescence microscopy. In contrast, when expressed together with the protease, beta-gal was predominantly detected in the nuclei. Moreover, we isolated 293-cell clones constitutively expressing the protease, indicating that this protease is not cytotoxic. These results suggest that the viral protease-mediated alteration of subcellular localization can potentially regulate the function of nuclear proteins.

14.
The uapC gene of Aspergillus nidulans belongs to a family of nucleobase-specific transporters conserved in prokaryotic and eukaryotic organisms. We report the use of immunological and green fluorescent protein based strategies to study protein expression and subcellular distribution of UapC. A chimeric protein containing a plant-adapted green fluorescent protein (sGFP) fused to the C-terminus of UapC was shown to be functional in vivo, as it complements a triple mutant (i.e., uapC(-) uapA(-) azgA(-)) unable to grow on uric acid as the sole nitrogen source. UapC-GFP is located in the plasma membrane and, secondarily, in internal structures observed as fluorescent dots. A strong correlation was found between cellular levels of UapC-GFP fluorescence and known patterns of uapC gene expression. This work represents the first in vivo study of protein expression and subcellular localization of a filamentous fungal nucleobase transporter.

15.
MOTIVATION: Knowing the localization of a protein within the cell helps elucidate its role in biological processes, its function and its potential as a drug target. Thus, subcellular localization prediction is an active research area. Numerous localization prediction systems are described in the literature; some focus on specific localizations or organisms, while others attempt to cover a wide range of localizations. RESULTS: We introduce SherLoc, a new comprehensive system for predicting the localization of eukaryotic proteins. It integrates several types of sequence and text-based features. While applying the widely used support vector machines (SVMs), SherLoc's main novelty lies in the way in which it selects its text sources and features, and integrates those with sequence-based features. We test SherLoc on previously used datasets, as well as on a new set devised specifically to test its predictive power, and show that SherLoc consistently improves on previously reported results. We also report the results of applying SherLoc to a large set of yet-unlocalized proteins. AVAILABILITY: SherLoc, along with Supplementary Information, is available at: http://www-bs.informatik.uni-tuebingen.de/Services/SherLoc/

16.

Background  

Knowing the subcellular location of proteins provides clues to their function as well as the interconnectivity of biological processes. Dozens of tools are available for predicting protein location in the eukaryotic cell. Each tool performs well on certain data sets, but their predictions often disagree for a given protein. Since the individual tools each have particular strengths, we set out to integrate them in a way that optimally exploits their potential. The method we present here is applicable to various subcellular locations, but tailored for predicting whether or not a protein is localized in mitochondria. Knowledge of the mitochondrial proteome is relevant to understanding the role of this organelle in global cellular processes.

17.
18.
A tool called Locfind for the sequence-based prediction of the localization of eukaryotic proteins is introduced. It is based on bidirectional recurrent neural networks trained to read the amino acid sequence sequentially and produce localization information along the sequence. Systematic variation of the network architecture in combination with an efficient learning algorithm led to 91% correct localization predictions for novel proteins in fivefold cross-validation. The data and evaluation procedure are the same as those used for the non-plant part of the widely used TargetP tool by Emanuelsson et al. The Locfind system is available on the WWW for predictions (http://www.stepc.gr/~synaptic/locfind.html).

19.
20.
Monospecific antibodies were generated against each of six different peptide sequences derived from rat and human alpha-transforming growth factor (alpha-TGF). The affinity-purified antibody to the 17 amino acid carboxyl-terminal portion of the molecule proved most useful in detecting alpha-TGF. When used in a peptide-based radioimmunoassay, it was possible to measure nanogram quantities of native alpha-TGF in conditioned cell culture media. When used to analyze cell lysate, these antibodies specifically recognized a 21-kilodalton protein species. Indirect immunofluorescence localization procedures revealed a high concentration of alpha-TGF in a perinuclear ring with a diffuse cytoplasmic distribution. These results suggest that a precursor form of alpha-TGF has a cellular role beyond that of an autocrine growth factor.
