Similar articles
1.
ABSTRACT: BACKGROUND: Understanding protein subcellular localization is a necessary component toward understanding the overall function of a protein. Numerous computational methods have been published over the past decade, with varying degrees of success. Despite the large number of published methods in this area, only a small fraction of them are available for researchers to use in their own studies. Of those that are available, many are limited by predicting only a small number of major organelles in the cell. Additionally, the majority of methods predict only a single location, even though it is known that a large fraction of the proteins in eukaryotic species shuttle between locations to carry out their function. FINDINGS: We present a software package and a web server for predicting subcellular localization of protein sequences based on the ngLOC method. ngLOC is an n-gram-based Bayesian classifier that predicts subcellular localization of proteins in both prokaryotes and eukaryotes. The overall prediction accuracy varies from 89.8% to 91.4% across species. This program can predict 11 distinct locations each in plant and animal species. ngLOC also predicts 4 and 5 distinct locations on Gram-positive and Gram-negative bacterial datasets, respectively. CONCLUSIONS: ngLOC is a generic method that can be trained by data from a variety of species or classes for predicting protein subcellular localization. The standalone software is freely available for academic use under GNU GPL, and the ngLOC web server is also accessible at http://ngloc.unmc.edu.
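The idea of an n-gram-based Bayesian classifier can be illustrated with a minimal naive Bayes sketch over overlapping sequence n-grams. This is not the ngLOC implementation: the class name, the smoothing scheme, the n-gram length, and the toy sequences and labels below are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def ngrams(seq, n):
    """Return all overlapping n-grams of a sequence."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


class NGramNaiveBayes:
    """Hypothetical naive Bayes classifier over sequence n-grams."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n                          # n-gram length (assumed)
        self.alpha = alpha                  # additive smoothing (assumed)
        self.counts = defaultdict(Counter)  # n-gram counts per class
        self.class_totals = Counter()       # total n-grams per class
        self.class_docs = Counter()         # training sequences per class
        self.vocab = set()

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            grams = ngrams(seq, self.n)
            self.counts[label].update(grams)
            self.class_totals[label] += len(grams)
            self.class_docs[label] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, seq):
        """Log prior plus summed log likelihoods, one score per class."""
        total_docs = sum(self.class_docs.values())
        v = len(self.vocab) + 1  # one extra slot for unseen n-grams
        scores = {}
        for label in self.class_docs:
            score = math.log(self.class_docs[label] / total_docs)
            denom = self.class_totals[label] + self.alpha * v
            for g in ngrams(seq, self.n):
                score += math.log((self.counts[label][g] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Toy training data (made up, not real proteins or annotations).
train_seqs = ["MKKLLLAAAL", "MKLLLLAALL", "DEEDKRKRDE", "KRKRDEEDKR"]
train_labels = ["secreted", "secreted", "nuclear", "nuclear"]
model = NGramNaiveBayes(n=2).fit(train_seqs, train_labels)
```

Calling `model.predict(sequence)` returns the class with the highest posterior score. A multi-location predictor would instead inspect the full dictionary returned by `model.log_scores(sequence)`.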

2.
MOTIVATION: There is a scarcity of efficient computational methods for predicting protein subcellular localization in eukaryotes. Currently available methods have several limitations that make them inadequate for genome-scale predictions. Here, we present a new prediction method, pTARGET, that can predict proteins targeted to nine different subcellular locations in eukaryotic animal species. RESULTS: The nine subcellular locations predicted by pTARGET include cytoplasm, endoplasmic reticulum, extracellular/secretory, Golgi, lysosomes, mitochondria, nucleus, plasma membrane and peroxisomes. Predictions are based on the location-specific protein functional domains and the amino acid compositional differences across different subcellular locations. Overall, this method can predict 68-87% of the true positives at accuracy rates of 96-99%. Comparison of the prediction performance against PSORT showed that pTARGET prediction rates are higher by 11-60% in 6 of the 8 locations tested. In addition, the pTARGET method is robust enough for genome-scale prediction of protein subcellular localizations since it does not rely on the presence of signal or target peptides. AVAILABILITY: A public web server based on the pTARGET method is accessible at the URL http://bioinformatics.albany.edu/~ptarget. Datasets used for developing pTARGET can be downloaded from this web server. Source code will be available on request from the corresponding author.

3.
Predicting the subcellular localization of proteins overcomes the major drawbacks of high-throughput localization experiments, which are costly and time-consuming. However, current subcellular localization predictors are limited in scope and accuracy. In particular, most predictors perform well on certain locations or with certain data sets but poorly on others. Here, we present PSI, a novel high-accuracy web server for plant subcellular localization prediction. PSI combines the output of multiple specialized predictors through a joint approach of group decision-making strategy and machine learning methods to give an integrated best result. The overall accuracy obtained (up to 93.4%) was higher than that of the best individual predictor (CELLO) by ∼10.7%. The precision for each predictable subcellular location (more than 80%) far exceeds that of the individual predictors. It can also deal with multi-localization proteins. PSI is expected to be a powerful tool in protein location engineering as well as in plant sciences, while the strategy employed could be applied to other integrative problems. A user-friendly web server, PSI, has been developed for free access at http://bis.zju.edu.cn/psi/.

4.
Protein subcellular localization is a major determinant of protein function. However, this important protein feature is often described in terms of discrete and qualitative categories of subcellular compartments, and therefore it has limited applications in quantitative protein function analyses. Here, we present Protein Localization Analysis and Search Tools (PLAST), an automated analysis framework for constructing and comparing quantitative signatures of protein subcellular localization patterns based on microscopy images. PLAST produces human-interpretable protein localization maps that quantitatively describe the similarities in the localization patterns of proteins and major subcellular compartments, without requiring manual assignment or supervised learning of these compartments. Using the budding yeast Saccharomyces cerevisiae as a model system, we show that PLAST is more accurate than existing, qualitative protein localization annotations in identifying known co-localized proteins. Furthermore, we demonstrate that PLAST can reveal protein localization-function relationships that are not obvious from these annotations. First, we identified proteins that have similar localization patterns and participate in closely-related biological processes, but do not necessarily form stable complexes with each other or localize at the same organelles. Second, we found an association between spatial and functional divergences of proteins during evolution. Surprisingly, as proteins with common ancestors evolve, they tend to develop more diverged subcellular localization patterns, but still occupy similar numbers of compartments. This suggests that divergence of protein localization might be more frequently due to the development of more specific localization patterns over ancestral compartments than the occupation of new compartments. 
PLAST enables systematic and quantitative analyses of protein localization-function relationships, and will be useful to elucidate protein functions and how these functions were acquired in cells from different organisms or species. A public web interface of PLAST is available at http://plast.bii.a-star.edu.sg.

5.
6.
Tantoso E, Li KB. Amino Acids 2008, 35(2):345-353
Identifying a protein's subcellular localization is an important step toward understanding its function. However, the experimental work involved is usually laborious, time-consuming and costly. Computational prediction is therefore valuable for reducing this cost. Here we provide a method to predict protein subcellular localization by using amino acid composition and physicochemical properties. The method concatenates the information extracted from a protein's N-terminal, middle and full sequence. Each part is represented by amino acid composition, weighted amino acid composition, five-level grouping composition and five-level dipeptide composition. We divided our dataset into training and testing sets. The training set is used to determine the best-performing amino acid index by using five-fold cross-validation, whereas the testing set acts as the independent dataset to evaluate the performance of our model. With the novel representation method, we achieve an accuracy of approximately 75% on the independent dataset. We conclude that this new representation indeed performs well and is able to extract the protein sequence information. We have developed a web server for predicting protein subcellular localization. The web server is available at http://aaindexloc.bii.a-star.edu.sg.
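The concatenated representation described in this abstract can be sketched for its simplest component, plain amino acid composition. The sketch below covers only that component; the weighted, five-level grouping and dipeptide compositions are omitted. The function names, the 20-residue N-terminal length, and the choice to treat everything after the N-terminal segment as the "middle" are assumptions, not details taken from the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fraction of each standard amino acid among the standard residues in seq."""
    residues = [c for c in seq.upper() if c in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    total = len(residues)
    return [residues.count(a) / total for a in AMINO_ACIDS]


def split_parts(seq, n_term_len=20):
    """Split into N-terminal, middle and full sequence (boundaries assumed)."""
    n_term = seq[:n_term_len]
    middle = seq[n_term_len:]
    return n_term, middle, seq


def feature_vector(seq, n_term_len=20):
    """Concatenate the composition of each part into one 60-dimensional vector."""
    features = []
    for part in split_parts(seq, n_term_len):
        features.extend(composition(part))
    return features
```

A vector built this way has a fixed length regardless of protein length, which is what lets sequences of different sizes be fed to the same classifier.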

7.
Automated image analysis of protein localization in budding yeast   Total citations: 1, self-citations: 0, citations by others: 1
MOTIVATION: The yeast Saccharomyces cerevisiae is the first eukaryotic organism to have its genome completely sequenced. Since then, several large-scale analyses of the yeast genome have provided extensive functional annotations of individual genes and proteins. One fundamental property of a protein is its subcellular localization, which provides critical information about how this protein works in a cell. An important project therefore was the creation of the yeast GFP fusion localization database by the University of California, San Francisco, USA (UCSF). This database provides localization data for 75% of the proteins believed to be encoded by the yeast genome. These proteins were classified into 22 distinct subcellular location categories by visual examination. Based on our past success at building automated systems to classify subcellular location patterns in mammalian cells, we sought to create a similar system for yeast. RESULTS: We developed computational methods to automatically analyze the images created by the UCSF yeast GFP fusion localization project. The system was trained to recognize the same location categories that were used in that study. We applied the system to 2640 images, and the system gave the same label as the previous assignments to 2139 images (81%). When only the highest confidence assignments were considered, 94.7% agreement was observed. Visual examination of the proteins for which the two approaches disagree suggests that at least some of the automated assignments may be more accurate. The automated method provides an objective, quantitative and repeatable assignment of protein locations that can be applied to new collections of yeast images (e.g. for different strains or the same strain under different conditions). It is also important to note that this performance could be achieved without requiring colocalization with any marker proteins. 
AVAILABILITY: The original images analyzed in this article are available at http://yeastgfp.ucsf.edu, and source code and results are available at http://murphylab.web.cmu.edu/software.

8.
MOTIVATION: Subcellular localization is a key functional characteristic of proteins. A fully automatic and reliable prediction system for protein subcellular localization is needed, especially for the analysis of large-scale genome sequences. RESULTS: In this paper, a Support Vector Machine is introduced to predict the subcellular localization of proteins from their amino acid compositions. The total prediction accuracies reach 91.4% for three subcellular locations in prokaryotic organisms and 79.4% for four locations in eukaryotic organisms. Predictions by our approach are robust to errors in the protein N-terminal sequences. This new approach provides superior prediction performance compared with existing algorithms based on amino acid composition and can be a complementary method to other existing methods based on sorting signals. AVAILABILITY: A web server implementing the prediction method is available at http://www.bioinfo.tsinghua.edu.cn/SubLoc/. SUPPLEMENTARY INFORMATION: Supplementary material is available at http://www.bioinfo.tsinghua.edu.cn/SubLoc/.

9.
Proper subcellular localization is critical for proteins to perform their roles in cellular functions. Proteins are transported by different cellular sorting pathways, some of which take a protein through several intermediate locations until reaching its final destination. The pathway a protein is transported through is determined by carrier proteins that bind to specific sequence motifs. In this article, we present a new method that integrates protein interaction and sequence motif data to model how proteins are sorted through these sorting pathways. We use a hidden Markov model (HMM) to represent protein sorting pathways. The model is able to determine intermediate sorting states and to assign carrier proteins and motifs to the sorting pathways. In simulation studies, we show that the method can accurately recover an underlying sorting model. Using data for yeast, we show that our model leads to accurate prediction of subcellular localization. We also show that the pathways learned by our model recover many known sorting pathways and correctly assign proteins to the path they utilize. The learned model identified new pathways and their putative carriers and motifs and these may represent novel protein sorting mechanisms. Supplementary results and software implementation are available from http://murphylab.web.cmu.edu/software/2010_RECOMB_pathways/.
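How an HMM can recover a path through intermediate states is illustrated by a generic Viterbi decoder on a toy model. This is textbook Viterbi decoding, not the authors' model or software: the compartment states, the observation symbols, and every probability below are invented for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path for a sequence of observations."""

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log probability of the best path ending in state s at step t
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p] + log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            scores[s] = score + log(emit_p[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical three-compartment pathway; all numbers are assumptions.
states = ["ER", "Golgi", "PM"]
start_p = {"ER": 1.0, "Golgi": 0.0, "PM": 0.0}
trans_p = {
    "ER": {"ER": 0.3, "Golgi": 0.7, "PM": 0.0},
    "Golgi": {"ER": 0.0, "Golgi": 0.3, "PM": 0.7},
    "PM": {"ER": 0.0, "Golgi": 0.0, "PM": 1.0},
}
emit_p = {
    "ER": {"signal": 0.8, "glyco": 0.1, "anchor": 0.1},
    "Golgi": {"signal": 0.1, "glyco": 0.8, "anchor": 0.1},
    "PM": {"signal": 0.1, "glyco": 0.1, "anchor": 0.8},
}
path = viterbi(["signal", "glyco", "anchor"], states, start_p, trans_p, emit_p)
```

Working in log space avoids numerical underflow on long observation sequences. Zero-probability transitions map to negative infinity, so forbidden steps are never chosen when a permitted one exists.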

10.
Cytokinins are ubiquitous plant hormones; their signal is perceived by sensor histidine kinases—cytokinin receptors. This review focuses on recent advances in understanding cytokinin receptor structure, in particular the sensing module and adjacent domains, which play an important role in hormone recognition, signal transduction and receptor subcellular localization. Principles of cytokinin binding site organization and point mutations affecting signaling are discussed. To date, more than 100 putative cytokinin receptor genes from different plant species have been identified through whole-genome sequencing. This allowed us to employ evolutionary and bioinformatics approaches to clarify some new aspects of receptor structure and function. Non-transmembrane areas adjacent to the ligand-binding CHASE domain were characterized in detail and new conserved protein motifs were recovered. Putative mechanisms for cytokinin-triggered receptor activation were suggested.

11.
Lee K, Kim DW, Na D, Lee KH, Lee D. Nucleic Acids Research 2006, 34(17):4655-4666
Subcellular localization is one of the key functional characteristics of proteins. An automatic and efficient prediction method for protein subcellular localization is much needed for large-scale genome analysis. From a machine learning point of view, a dataset of protein localization has several characteristics: the dataset has too many classes (there are more than 10 localizations in a cell), it is a multi-label dataset (a protein may occur in several different subcellular locations), and it is too imbalanced (the number of proteins in each localization is remarkably different). Even though much previous work has been done on the prediction of protein subcellular localization, none of it effectively tackles these characteristics at the same time. Thus, a new computational method for protein localization is needed for more reliable outcomes. To address the issue, we present a protein localization predictor based on D-SVDD (PLPD) for the prediction of protein localization, which can find the likelihood of a specific localization of a protein more easily and more correctly. Moreover, we introduce three measurements for the more precise evaluation of a protein localization predictor. Results on various datasets derived from the experiments of Huh et al. (2003) indicate that the proposed PLPD method represents a different approach that might play a complementary role to the existing methods, such as Nearest Neighbor method and discriminate covariant method. Finally, after finding a good boundary for each localization using the 5184 classified proteins as training data, we predicted 138 proteins whose subcellular localizations could not be clearly observed by the experiments of Huh et al. (2003).

12.
Guo J, Lin Y, Liu X. Proteomics 2006, 6(19):5099-5105
This paper proposes a new integrative system (GNBSL--Gram-negative bacteria subcellular localization) for subcellular localization prediction, specialized for Gram-negative bacterial proteins. First, the system generates a position-specific frequency matrix (PSFM) and a position-specific scoring matrix (PSSM) for each protein sequence by searching the Swiss-Prot database. Then different features are extracted by four modules from the PSFM and the PSSM. The features include whole-sequence amino acid composition, N- and C-terminus amino acid composition, dipeptide composition, and segment composition. Four probabilistic neural network (PNN) classifiers are used to classify these modules. To further improve the performance, two modules trained by support vector machine (SVM) are added to this system. One module extracts the residue-couple distribution from the amino acid sequence and the other module applies a pairwise profile alignment kernel to measure the local similarity between every two sequences. Finally, an additional SVM is used to fuse the outputs from the six modules. Tests on a benchmark dataset show that the overall success rate of GNBSL is higher than those of PSORT-B, CELLO, and PSLpred. The GNBSL web server can be accessed at http://166.111.24.5/webtools/GNBSL/index.htm.

13.
Background information. Precise localization of proteins to specialized subcellular domains is fundamental for proper neuronal development and function. The neural microtubule‐regulatory phosphoproteins of the stathmin family are such proteins whose specific functions are controlled by subcellular localization. Whereas stathmin is cytosolic, SCG10, SCLIP and RB3/RB3′/RB3″ are localized to the Golgi and vesicle‐like structures along neurites and at growth cones. We examined the molecular determinants involved in the regulation of this specific subcellular localization in hippocampal neurons in culture. Results. We show that their conserved N‐terminal domain A carrying two palmitoylation sites is dominant over the others for Golgi and vesicle‐like localization. Using palmitoylation‐deficient GFP (green fluorescent protein) fusion mutants, we demonstrate that domains A of stathmin proteins have the particular ability to control protein targeting to either Golgi or mitochondria, depending on their palmitoylation. This regulation involves the co‐operation of two subdomains within domain A, and seems also to be under the control of its SLD (stathmin‐like domain) extension. Conclusions. Our results reveal that, in specific biological conditions, palmitoylation of stathmin proteins might be able to control their targeting to express their functional activities at appropriate subcellular sites. More generally, they open new perspectives regarding the role of palmitoylation as a signalling mechanism orienting proteins to their functional subcellular compartments.

14.
Plant membrane proteome databases   Total citations: 6, self-citations: 0, citations by others: 6
In all living organisms transmembrane (TM) proteins are crucially involved in many physiological processes and constitute 20-30% of the proteome. An important class of TM proteins are transporters that interconnect biochemical pathways across the plasma membrane and intracellular membranes, e.g. the mitochondrial membranes and chloroplast envelope membranes. In recent years, bioinformatical tools to predict TM domains and subcellular localization were developed and used to analyze the first complete plant genomes of Arabidopsis and rice. This review focuses on plant TM proteome databases that compile topology and intracellular targeting predictions and different kinds of experimental data. In addition, other web sites are discussed that contribute useful experimental and/or bioinformatical data.

15.
16.
17.
The Drosophila protein HP1 is a 206 amino acid heterochromatin-associated nonhistone chromosomal protein. Based on the characterization of HP1 to date, there are three properties intrinsic to HP1: nuclear localization, heterochromatin binding, and gene silencing. In this work, we have concentrated on the identification of domains responsible for the nuclear localization and heterochromatin binding properties of HP1. We have expressed a series of beta-galactosidase/HP1 fusion proteins in Drosophila embryos and polytene tissue and have used beta-galactosidase enzymatic activity to identify the subcellular localization of each fusion protein. We have identified two functional domains in HP1: a nuclear localization domain of amino acids 152-206 and a heterochromatin binding domain of amino acids 95-206. Both of these functional domains overlap an evolutionarily conserved COOH-terminal region.

18.
A-kinase anchoring proteins (AKAPs) assemble and compartmentalize multiprotein signaling complexes at discrete subcellular locales and thus confer specificity to transduction cascades using ubiquitous signaling enzymes, such as protein kinase A. Intrinsic targeting domains in each AKAP determine the subcellular localization of these complexes and, along with protein-protein interaction domains, form the core of AKAP function. As a foundational step toward elucidating the relationship between location and function, we have used cross-species sequence analysis and deletion mapping to facilitate the identification of the targeting determinants of AKAP12 (also known as SSeCKS or Gravin). Three charged residue-rich regions were identified that regulate two aspects of AKAP12 localization, nuclear/cytoplasmic partitioning and perinuclear/cell periphery targeting. Using deletion mapping and green fluorescent protein chimeras, we uncovered a heretofore unrecognized nuclear localization potential. Five nuclear localization signals, including a novel class of this type of signal termed X2-NLS, are found in the central region of AKAP12 and are important for nuclear targeting. However, this nuclear localization is suppressed by the negatively charged C terminus that mediates nuclear exclusion. In this condition, the distribution of AKAP12 is regulated by an N-terminal targeting domain that simultaneously directs perinuclear and peripheral AKAP12 localization. Three basic residue-rich regions in the N-terminal targeting region have similarity to the MARCKS proteins and were found to control AKAP12 localization to ganglioside-rich regions at the cell periphery. Our data suggest that AKAP12 localization is regulated by a hierarchy of targeting domains and that the localization of AKAP12-assembled signaling complexes may be dynamically regulated.

19.
The subcellular localization of a protein is important for its proper function. Escherichia coli MinE is a small protein with clear subcellular localization, which makes it a good model for studying protein localization mechanisms. In the present study, a series of recombinant minE genes, truncated at one end or in the middle regions and fused with egfp, was constructed, and these recombinant proteins could compete with the chromosomal MinE for function. Our results showed that the sequences related to the subcellular localization of MinE span several functional domains, demonstrating that MinE positioning in cells depends on multiple factors. The eGFP fusions with some N-terminally truncated MinE variants resulted in different cell phenotypes and localization features, implying that these fusions can interfere with chromosomal MinE’s function, similar to the MinE36–88 phenotype in the previous report. Amino acid changes in the region 32–48 can alter MinE conformation and influence its dimerization. Some truncated protein structures may be unstable. Thus, MinE localization is a prerequisite for its proper anti-MinCD function, and some new features of MinE were demonstrated. This approach can be extended to subcellular localization research on other essential proteins.

20.
We describe a streamlined and systematic method for cloning green fluorescent protein (GFP)-open reading frame (ORF) fusions and assessing their subcellular localization in Arabidopsis thaliana cells. The sequencing of the Arabidopsis genome has made it feasible to undertake genome-based approaches to determine the function of each protein and define its subcellular localization. This is an essential step towards full functional analysis. The approach described here allows the economical handling of hundreds of expressed plant proteins in a timely fashion. We have integrated recombinational cloning of full-length trimmed ORF clones (available from the SSP consortium) with high-efficiency transient transformation of Arabidopsis cell cultures by a hypervirulent strain of Agrobacterium. To demonstrate its utility, we have used a selection of trimmed ORFs, representing a variety of key cellular processes and have defined the localization patterns of 155 fusion proteins. These patterns have been classified into five main categories, including cytoplasmic, nuclear, nucleolar, organellar and endomembrane compartments. Several genes annotated in GenBank as unknown have been ascribed a protein localization pattern. We also demonstrate the application of flow cytometry to estimate the transformation efficiency and cell cycle phase of the GFP-positive cells. This approach can be extended to functional studies, including the precise cellular localization and the prediction of the role of unknown proteins, the confirmation of bioinformatic predictions and proteomic experiments, such as the determination of protein interactions in vivo, and therefore has numerous applications in the post-genomic analysis of protein function.
