The eukaryotic cell has an intricate architecture with compartments and substructures dedicated to particular biological processes. Knowing the subcellular location of proteins not only indicates how bio-processes are organized in different cellular compartments, but also contributes to unravelling the function of individual proteins. Computational localization prediction is possible based on sequence information alone, and has been successfully applied to proteins from virtually all subcellular compartments and all domains of life. However, we realized that current prediction tools do not perform well on partial protein sequences such as those inferred from Expressed Sequence Tag (EST) data, limiting the exploitation of the large and taxonomically most comprehensive body of sequence information from eukaryotes.

SUMMARY: We developed a web server PSLpred for predicting subcellular localization of gram-negative bacterial proteins with an overall accuracy of 91.2%. PSLpred is a hybrid approach-based method that integrates PSI-BLAST and three SVM modules based on compositions of residues, dipeptides and physico-chemical properties. The prediction accuracies of 90.7, 86.8, 90.3, 95.2 and 90.6% were attained for cytoplasmic, extracellular, inner-membrane, outer-membrane and periplasmic proteins, respectively. Furthermore, PSLpred was able to predict approximately 74% of sequences with an average prediction accuracy of 98% at RI = 5. AVAILABILITY: PSLpred is available at http://www.imtech.res.in/raghava/pslpred/
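The residue and dipeptide compositions mentioned in the abstract above can be illustrated with a minimal sketch. This is not PSLpred's code: the function names, the 20-letter alphabet constant, and the choice to skip non-standard residues are assumptions made for illustration, and the SVM and PSI-BLAST stages are left out.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes); an assumed alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_composition(seq):
    """Return a 20-element vector of residue fractions for seq."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-element vector of adjacent-pair fractions for seq."""
    seq = seq.upper()
    # Keep only pairs made of two standard residues.
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    counts = Counter(pairs)
    n = len(pairs)
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    return [counts[k] / n if n else 0.0 for k in keys]
```

Vectors of this kind would typically be passed to a classifier; how PSLpred scales or combines them is not stated in the abstract.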

Automated sequence annotation is a major goal of the post-genomic era, with hundreds of genomes in the databases from both prokaryotes and eukaryotes. While the number of fully sequenced chromosomes from microbial organisms has increased exponentially over the last decade to more than 600, presently we know the whole DNA content of only 25 eukaryotic organisms, including Homo sapiens. However, the process of genome annotation is far from being completed. This is particularly relevant in eukaryotes, whose cells contain several subcellular compartments, or organelles, enclosed by membranes, where different relevant functions are performed. Translocation across the membrane into the organelles is a highly regulated and complex cellular process. Indeed, different proteins and/or protein isoforms, originating from genes by alternative splicing, may be conveyed to different cell compartments, depending on their specific role in the cell. During recent years the prediction of subcellular localization (SL) by computational means has been an active research area. Several methods are presently available, based on different notions and addressing different aspects of SL. This review provides a short overview of the best-performing methods described in the literature, highlighting their predictive capabilities and different applications.

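The encoding of protein sequences is one of the key techniques in subcellular localization prediction. This paper describes in some detail the protein sequence encoding algorithms currently available, and points out some problems in sequence encoding and possible directions for future development.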

Subcellular localization is an important protein property, which is related to function, interactions and other features. As experimental determination of the localization can be tedious, especially for large numbers of proteins, a number of prediction tools have been developed. We developed the PROlocalizer service that integrates 11 individual methods to predict altogether 12 localizations for animal proteins. The method allows the submission of a number of proteins and mutations and generates a detailed informative document of the prediction and obtained results. PROlocalizer is available at .

Automated prediction of bacterial protein subcellular localization is an important tool for genome annotation and drug discovery. PSORT has been one of the most widely used computational methods for such bacterial protein analysis; however, it has not been updated since it was introduced in 1991. In addition, neither PSORT nor any of the other computational methods available make predictions for all five of the localization sites characteristic of Gram-negative bacteria. Here we present PSORT-B, an updated version of PSORT for Gram-negative bacteria, which is available as a web-based application at http://www.psort.org. PSORT-B examines a given protein sequence for amino acid composition, similarity to proteins of known localization, presence of a signal peptide, transmembrane alpha-helices and motifs corresponding to specific localizations. A probabilistic method integrates these analyses, returning a list of five possible localization sites with associated probability scores. PSORT-B, designed to favor high precision (specificity) over high recall (sensitivity), attained an overall precision of 97% and recall of 75% in 5-fold cross-validation tests, using a dataset we developed of 1443 proteins of experimentally known localization. This dataset, the largest of its kind, is freely available, along with the PSORT-B source code (under GNU General Public License).
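The trade-off between precision and recall described above is easiest to see when a predictor is allowed to abstain. The sketch below is illustrative only and is not taken from PSORT-B: it assumes that an abstention is represented by `None`, that precision is computed over the predictions actually made, and that recall is computed over all proteins. The function name and the example labels are hypothetical.

```python
def precision_recall(true_labels, predicted_labels):
    """Overall precision and recall when None means 'no prediction made'."""
    # Keep only the proteins for which a localization was actually predicted.
    made = [(t, p) for t, p in zip(true_labels, predicted_labels) if p is not None]
    correct = sum(1 for t, p in made if t == p)
    precision = correct / len(made) if made else 0.0
    recall = correct / len(true_labels) if true_labels else 0.0
    return precision, recall


# Hypothetical example: four proteins, one abstention, one wrong call.
truth = ["cytoplasmic", "periplasmic", "outer-membrane", "extracellular"]
calls = ["cytoplasmic", None, "outer-membrane", "periplasmic"]
precision, recall = precision_recall(truth, calls)
```

Under these assumptions, abstaining on uncertain proteins leaves precision untouched but lowers recall, which is the behaviour a precision-oriented design accepts.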

The subcellular localization of a protein can provide important information about its function within the cell. As eukaryotic cells and particularly mammalian cells are characterized by a high degree of compartmentalization, most protein activities can be assigned to particular cellular compartments. The categorization of proteins by their subcellular localization is therefore one of the essential goals of the functional annotation of the human genome. We previously performed a subcellular localization screen of 52 proteins encoded on human chromosome 21. In the current study, we compared the experimental localization data to the in silico results generated by nine leading software packages with different prediction resolutions. The comparison revealed striking differences between the programs in the accuracy of their subcellular protein localization predictions. Our results strongly suggest that the recently developed predictors utilizing multiple prediction methods tend to provide significantly better performance over purely sequence-based or homology-based predictions.

Proteins on the move: insights gained from fluorescent protein technologies
Proteins are always on the move, and this may occur through diffusion or active transport. The realization that the regulation of signal transduction is highly dynamic in space and time has stimulated intense interest in the movement of proteins. Over the past decade, numerous new technologies using fluorescent proteins have been developed, allowing us to observe the spatiotemporal dynamics of proteins in living cells. These technologies have greatly advanced our understanding of protein dynamics, including protein movement and protein interactions.

MOTIVATION: Knowing the localization of a protein within the cell helps elucidate its role in biological processes, its function and its potential as a drug target. Thus, subcellular localization prediction is an active research area. Numerous localization prediction systems are described in the literature; some focus on specific localizations or organisms, while others attempt to cover a wide range of localizations. RESULTS: We introduce SherLoc, a new comprehensive system for predicting the localization of eukaryotic proteins. It integrates several types of sequence and text-based features. While applying the widely used support vector machines (SVMs), SherLoc's main novelty lies in the way in which it selects its text sources and features, and integrates those with sequence-based features. We test SherLoc on previously used datasets, as well as on a new set devised specifically to test its predictive power, and show that SherLoc consistently improves on previous reported results. We also report the results of applying SherLoc to a large set of yet-unlocalized proteins. AVAILABILITY: SherLoc, along with Supplementary Information, is available at: http://www-bs.informatik.uni-tuebingen.de/Services/SherLoc/



Protein subcellular localization is an important determinant of protein function and hence, reliable methods for prediction of localization are needed. A number of prediction algorithms have been developed based on amino acid compositions or on the N-terminal characteristics (signal peptides) of proteins. However, such approaches lead to a loss of contextual information. Moreover, where information about the physicochemical properties of amino acids has been used, the methods employed to exploit that information are less than optimal and could use the information more effectively.

The ability to predict the subcellular localization of a protein from its sequence is of great importance, as it provides information about the protein's function. We present a computational tool, PredSL, which utilizes neural networks, Markov chains, profile hidden Markov models, and scoring matrices for the prediction of the subcellular localization of proteins in eukaryotic cells from the N-terminal amino acid sequence. It aims to classify proteins into five groups: chloroplast, thylakoid, mitochondrion, secretory pathway, and "other". When tested in a fivefold cross-validation procedure, PredSL demonstrates 86.7% and 87.1% overall accuracy for the plant and non-plant datasets, respectively. Compared with TargetP, which is the most widely used method to date, and LumenP, the results of PredSL are comparable in most cases. When tested on the experimentally verified proteins of the Saccharomyces cerevisiae genome, PredSL performs comparably to, if not better than, any available algorithm for the same task. Furthermore, PredSL is the only method capable of predicting these subcellular localizations that is available as a stand-alone application, through the URL: http://bioinformatics.biol.uoa.gr/PredSL/.

Predicting subcellular localization of human proteins is a challenging problem, particularly when query proteins may have a multiplex character, i.e., simultaneously exist at, or move between, two or more different subcellular location sites. In a previous study, we developed a predictor called “Hum-mPLoc” to deal with the multiplex problem for the human protein system. However, Hum-mPLoc has the following shortcomings. (1) The accession number of a query protein is required as input in order to obtain a higher expected success rate through the higher-level prediction pathway; but many proteins, such as synthetic and hypothetical proteins as well as newly discovered proteins not yet deposited in databanks, do not have accession numbers. (2) Neither functional domain nor sequential evolution information was taken into account in Hum-mPLoc, and hence its power may be reduced accordingly. In view of this, a top-down strategy to address these shortcomings has been implemented. The new predictor thus obtained is called Hum-mPLoc 2.0, where the accession number is no longer needed as input. Moreover, both the functional domain information and the sequential evolution information have been fused into the predictor by an ensemble classifier. As a consequence, the prediction power has been significantly enhanced. The web server of Hum-mPLoc 2.0 is freely accessible at http://www.csbio.sjtu.edu.cn/bioinf/hum-multi-2/.



Knowledge of subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for prediction methods that show more robustness and higher accuracy.

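Using 500 proteins from the leaves of tea (Camellia sinensis (L.) O. Ktze.) as the data set, the reliability and sensitivity of four programs, TargetP, WoLF PSORT, LocTree and Plant-mPLoc, in predicting subcellular localization were compared. The prediction reliability of all four programs was above 80%, in the order TargetP > LocTree > WoLF PSORT > Plant-mPLoc. LocTree had the highest sensitivity for cytoplasmic and secreted proteins but the lowest for chloroplast proteins; Plant-mPLoc was the most sensitive for nuclear proteins but the least sensitive for cytoplasmic proteins; TargetP was the most sensitive for chloroplast proteins but could distinguish only three subcellular compartments; WoLF PSORT had the lowest sensitivity for secreted proteins but was fairly sensitive for all other proteins. Based on these results, the study offers practical recommendations for the use of the four programs.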



Gene Ontology (GO) annotation, which describes the function of genes and gene products across species, has recently been used to predict protein subcellular and subnuclear localization. Existing GO-based prediction methods for protein subcellular localization use the known accession numbers of query proteins to obtain their annotated GO terms. An accurate method for predicting the subcellular localization of novel proteins without known accession numbers, using only the input sequence, is worth developing.



Knowing the subcellular location of proteins provides clues to their function as well as the interconnectivity of biological processes. Dozens of tools are available for predicting protein location in the eukaryotic cell. Each tool performs well on certain data sets, but their predictions often disagree for a given protein. Since the individual tools each have particular strengths, we set out to integrate them in a way that optimally exploits their potential. The method we present here is applicable to various subcellular locations, but tailored for predicting whether or not a protein is localized in mitochondria. Knowledge of the mitochondrial proteome is relevant to understanding the role of this organelle in global cellular processes.

Axons as computing devices: basic insights gained from models.
Detailed models of single neurons are typically focused on the dendritic tree and ignore the axonal tree, assuming that the axon is a simple transmission line. In the last 40 years, however, several theoretical and experimental studies have suggested that axons could implement information processing tasks by exploiting: 1) the time delay in action potential (AP) propagation along the axon; 2) the differential filtering of APs into the axonal subtrees; and 3) their activity-dependent excitability. Models for axonal trees have attempted to examine the feasibility of these ideas. However, because the physiological and anatomical data on axons are seriously limited, realistic models of axons have not been developed. The present paper summarizes the main insights that were gained from simplified models of axons; it also highlights the stochastic nature of axons, a topic that was largely neglected in classical models of axons. The advance of new experimental techniques now makes it possible to examine axons experimentally in close detail. Theoretical tools and fast computers make it possible to go beyond the simplified models and to construct realistic models of axons. When tightly linked, experiments and theory will help to unravel how axons share the information processing tasks that single neurons implement.

Chang JM  Su EC  Lo A  Chiu HS  Sung TY  Hsu WL 《Proteins》2008,72(2):693-710
Prediction of protein subcellular localization (PSL) is important for genome annotation, protein function prediction, and drug discovery. Many computational approaches for PSL prediction based on protein sequences have been proposed in recent years for Gram-negative bacteria. We present PSLDoc, a method based on gapped-dipeptides and probabilistic latent semantic analysis (PLSA) to solve this problem. A protein is considered as a term string composed by gapped-dipeptides, which are defined as any two residues separated by one or more positions. The weighting scheme of gapped-dipeptides is calculated according to a position specific score matrix, which includes sequence evolutionary information. Then, PLSA is applied for feature reduction, and reduced vectors are input to five one-versus-rest support vector machine classifiers. The localization site with the highest probability is assigned as the final prediction. It has been reported that there is a strong correlation between sequence homology and subcellular localization (Nair and Rost, Protein Sci 2002;11:2836-2847; Yu et al., Proteins 2006;64:643-651). To properly evaluate the performance of PSLDoc, a target protein can be classified into low- or high-homology data sets. PSLDoc's overall accuracy of low- and high-homology data sets reaches 86.84% and 98.21%, respectively, and it compares favorably with that of CELLO II (Yu et al., Proteins 2006;64:643-651). In addition, we set a confidence threshold to achieve a high precision at specified levels of recall rates. When the confidence threshold is set at 0.7, PSLDoc achieves 97.89% in precision which is considerably better than that of PSORTb v.2.0 (Gardy et al., Bioinformatics 2005;21:617-623). Our approach demonstrates that the specific feature representation for proteins can be successfully applied to the prediction of protein subcellular localization and improves prediction accuracy. 
Besides, because of the generality of the representation, our method can be extended to eukaryotic proteomes in the future. The web server of PSLDoc is publicly available at http://bio-cluster.iis.sinica.edu.tw/~bioapp/PSLDoc/.
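The gapped-dipeptide representation described in this abstract can be sketched as a simple counting routine. This is a hedged illustration, not the PSLDoc implementation: the function name and the `max_gap` parameter are hypothetical, the gap is taken to be the number of intervening residues (starting at one, following the wording above), and plain counts stand in for the position-specific-score-matrix weighting and the PLSA reduction, which are omitted.

```python
from collections import Counter


def gapped_dipeptides(seq, max_gap=3):
    """Count (first residue, gap, second residue) terms in seq.

    gap is the number of residues lying between the two members of the
    pair, from 1 up to max_gap (an assumed, illustrative limit).
    """
    seq = seq.upper()
    counts = Counter()
    for gap in range(1, max_gap + 1):
        # The second residue sits gap + 1 positions after the first.
        for i in range(len(seq) - gap - 1):
            counts[(seq[i], gap, seq[i + gap + 1])] += 1
    return counts


# Hypothetical five-residue fragment used only to show the term structure.
terms = gapped_dipeptides("MKVLA", max_gap=3)
```

Treating each term as a word makes a protein resemble a document, which is what allows a text-analysis method such as PLSA to be applied to the resulting term vectors.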

Subcellular proteomics, as an important step to functional proteomics, has been a focus in proteomic research. However, the co-purification of "contaminating" proteins has been the major problem in all the subcellular proteomic research including all kinds of mitochondrial proteome research. It is often difficult to conclude whether these "contaminants" represent true endogenous partners or artificial associations induced by cell disruption or incomplete purification. To solve such a problem, we applied a high-throughput comparative proteome experimental strategy, ICAT approach performed with two-dimensional LC-MS/MS analysis, coupled with combinational usage of different bioinformatics tools, to study the proteome of rat liver mitochondria prepared with traditional centrifugation (CM) or further purified with a Nycodenz gradient (PM). A total of 169 proteins were identified and quantified convincingly in the ICAT analysis, in which 90 proteins have an ICAT ratio of PM:CM>1.0, while another 79 proteins have an ICAT ratio of PM:CM<1.0. Almost all the proteins annotated as mitochondrial according to Swiss-Prot annotation, bioinformatics prediction, and literature reports have a ratio of PM:CM>1.0, while proteins annotated as extracellular or secreted, cytoplasmic, endoplasmic reticulum, ribosomal, and so on have a ratio of PM:CM<1.0. Catalase and AP endonuclease 1, which have been known as peroxisomal and nuclear, respectively, have shown a ratio of PM:CM>1.0, confirming the reports about their mitochondrial location. Moreover, the 125 proteins with subcellular location annotation have been used as a testing dataset to evaluate the efficiency for ascertaining mitochondrial proteins by ICAT analysis and the bioinformatics tools such as PSORT, TargetP, SubLoc, MitoProt, and Predotar. 
The results indicated that ICAT analysis coupled with combinational usage of different bioinformatics tools could effectively ascertain mitochondrial proteins and distinguish contaminant proteins and even multilocation proteins. Using such a strategy, many novel proteins, known proteins without subcellular location annotation, and even known proteins that have been annotated to other locations were strongly indicated to be mitochondrial.
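The ratio criterion used in this study, PM:CM above or below 1.0, amounts to a simple partition of the quantified proteins. The sketch below assumes the ratios are already available as a mapping from protein name to PM:CM value; the function name, the protein names, and the numbers are hypothetical and are not data from the study.

```python
def partition_by_icat_ratio(ratios, threshold=1.0):
    """Split proteins into those enriched and those depleted by purification."""
    # Ratio above the threshold: more abundant in the purified fraction (PM).
    enriched = {name for name, r in ratios.items() if r > threshold}
    # Ratio below the threshold: reduced by purification, a likely contaminant.
    depleted = {name for name, r in ratios.items() if r < threshold}
    return enriched, depleted


# Hypothetical PM:CM ratios for illustration only.
example_ratios = {"protein_A": 1.8, "protein_B": 0.4, "protein_C": 1.2}
enriched, depleted = partition_by_icat_ratio(example_ratios)
```

A protein whose ratio equals the threshold exactly falls into neither set here; that handling is a choice of this sketch, since the abstract reports only ratios above or below 1.0.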

