Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Sequence conserved for subcellular localization   (total citations: 6; self-citations: 0; citations by others: 6)
The more proteins diverged in sequence, the more difficult it becomes for bioinformatics to infer similarities of protein function and structure from sequence. The precise thresholds used in automated genome annotations depend on the particular aspect of protein function transferred by homology. Here, we presented the first large-scale analysis of the relation between sequence similarity and identity in subcellular localization. Three results stood out: (1) The subcellular compartment is generally more conserved than what might have been expected given that short sequence motifs like nuclear localization signals can alter the native compartment; (2) the sequence conservation of localization is similar between different compartments; and (3) it is similar to the conservation of structure and enzymatic activity. In particular, we found the transition between the regions of conserved and nonconserved localization to be very sharp, although the thresholds for conservation were less well defined than for structure and enzymatic activity. We found that a simple measure for sequence similarity accounting for pairwise sequence identity and alignment length, the HSSP distance, distinguished accurately between protein pairs of identical and different localizations. In fact, BLAST expectation values outperformed the HSSP distance only for alignments in the subtwilight zone. We succeeded in slightly improving the accuracy of inferring localization through homology by fine tuning the thresholds. Finally, we applied our results to the entire SWISS-PROT database and five entirely sequenced eukaryotes.
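The abstract above does not state how the HSSP distance is computed. As a rough sketch only, the following assumes the length-dependent identity threshold commonly attributed to Rost (1999); the constants (480, 0.32, 19.5, and the length cut-offs 11 and 450) are recalled from that formulation and should be checked against the original paper before use. The function names are illustrative.

```python
import math


def hssp_curve(length: int) -> float:
    """Length-dependent percent-identity threshold (assumed Rost 1999 form)."""
    if length <= 11:
        return 100.0
    if length <= 450:
        # Threshold falls steeply for short alignments and flattens for long ones.
        return 480.0 * length ** (-0.32 * (1.0 + math.exp(-length / 1000.0)))
    return 19.5


def hssp_distance(percent_identity: float, length: int) -> float:
    """Distance of an alignment from the threshold curve.

    Positive values lie above the curve (more similar than the threshold),
    negative values lie below it.
    """
    return percent_identity - hssp_curve(length)
```

Under these assumptions, an alignment of 500 residues at 50% identity lies 30.5 percentage points above the curve, whereas a 10-residue alignment can never score above zero.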

2.
Genomics, 2019, 111(6): 1831-1838
Knowing a protein's localization can provide valuable information for elucidating its function. In recent years, with advances in human genomics and proteomics, it has become possible to characterize human proteins that are located in different subcellular localizations. In this study, we used topological properties and biological properties to characterize human proteins with six subcellular localizations. Almost all of these properties were found to be significantly different among the six protein categories. Network topology analysis indicated that several significant topological properties, including the degree and k-core, were higher for the mitochondrial proteins. Biological property analysis showed that the nuclear proteins appeared to be correlated with important biological function. We hope these findings may help in comprehensively understanding the biological function of proteins and in predicting protein subcellular localizations in human.
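The two topological properties named in the abstract above, degree and k-core, can be illustrated with a small pure-Python sketch. This is not the authors' pipeline; the edge list format and function names are assumptions made for illustration, and the peeling loop is the simple quadratic version rather than an optimized one.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            graph[a].add(b)
            graph[b].add(a)
    return graph


def core_numbers(graph):
    """Return the core number of every node by repeatedly peeling the
    node with the smallest remaining degree."""
    degree = {node: len(neighbors) for node, neighbors in graph.items()}
    remaining = set(graph)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.get)
        k = max(k, degree[node])  # core level never decreases while peeling
        core[node] = k
        remaining.remove(node)
        for neighbor in graph[node]:
            if neighbor in remaining:
                degree[neighbor] -= 1
    return core
```

For a triangle A-B-C with an extra protein D attached only to A, the degree of A is 3, the three triangle members have core number 2, and D has core number 1.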

3.
Comprehensive knowledge of protein subcellular localization is an important step toward understanding protein function. Recent advances in systems biology have allowed us to develop more accurate methods for characterizing proteins at the subcellular localization level. In this study, an analysis method was developed to characterize the topological properties and biological properties of the cytoplasmic proteins, inner membrane proteins, outer membrane proteins and periplasmic proteins in Escherichia coli (E. coli). Statistically significant differences were found in all topological properties and biological properties among proteins in different subcellular localizations. In addition, the differences in the compositions of the 20 amino acids were analyzed for the four protein categories. We also found that there were significant differences in all 20 amino acid compositions. These findings may be helpful for understanding the relationship between protein subcellular localization and biological function.
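The amino acid composition compared in the abstract above is simply the fraction of each of the 20 standard residues in a sequence. A minimal sketch follows; the function name is illustrative, and skipping non-standard letters such as X is an assumption, since the abstract does not say how such residues were handled.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence: str) -> dict:
    """Return the fraction of each standard amino acid in a protein sequence.

    Non-standard letters are ignored (an assumption for this sketch).
    """
    seq = sequence.upper()
    counts = {aa: seq.count(aa) for aa in AMINO_ACIDS}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}
```

The resulting 20-dimensional vectors can then be compared between localization classes with whatever statistical test is appropriate for the data.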

4.
MOTIVATION: Infectious diseases such as malaria result in millions of deaths each year. An important aspect of any host-pathogen system is the mechanism by which a pathogen can infect its host. One method of infection is via protein-protein interactions (PPIs) where pathogen proteins target host proteins. Developing computational methods that identify which PPIs enable a pathogen to infect a host has great implications in identifying potential targets for therapeutics. RESULTS: We present a method that integrates known intra-species PPIs with protein-domain profiles to predict PPIs between host and pathogen proteins. Given a set of intra-species PPIs, we identify the functional domains in each of the interacting proteins. For every pair of functional domains, we use Bayesian statistics to assess the probability that two proteins with that pair of domains will interact. We apply our method to the Homo sapiens-Plasmodium falciparum host-pathogen system. Our system predicts 516 PPIs between proteins from these two organisms. We show that pairs of human proteins we predict to interact with the same Plasmodium protein are close to each other in the human PPI network and that Plasmodium pairs predicted to interact with the same human protein are co-expressed in DNA microarray datasets measured during various stages of the Plasmodium life cycle. Finally, we identify functionally enriched sub-networks spanned by the predicted interactions and discuss the plausibility of our predictions. AVAILABILITY: Supplementary data are available at http://staff.vbi.vt.edu/dyermd/publications/dyer2007a.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
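The abstract above does not give the Bayesian formulation used for domain pairs. One simple way to illustrate the idea is a Beta-Binomial posterior mean per domain pair. Everything in this sketch is an assumption: the use of an explicit set of non-interacting protein pairs, the uniform Beta(1, 1) prior, and all names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product


def domain_pair_posteriors(interacting, non_interacting, domains,
                           alpha=1.0, beta=1.0):
    """Posterior mean P(interaction | domain pair) under a Beta(alpha, beta) prior.

    interacting / non_interacting: iterables of (protein, protein) pairs.
    domains: dict mapping each protein to a set of domain names.
    """
    positives = Counter()
    totals = Counter()
    for pairs, is_positive in ((interacting, True), (non_interacting, False)):
        for p, q in pairs:
            # Count each unordered domain pair at most once per protein pair.
            seen = {tuple(sorted((d1, d2)))
                    for d1, d2 in product(domains[p], domains[q])}
            for domain_pair in seen:
                totals[domain_pair] += 1
                if is_positive:
                    positives[domain_pair] += 1
    return {dp: (positives[dp] + alpha) / (totals[dp] + alpha + beta)
            for dp in totals}
```

With two interacting protein pairs that both carry an SH3 and a PRM domain and no counter-examples, the posterior mean for that domain pair is (2 + 1) / (2 + 2) = 0.75; a domain pair seen only in one non-interacting protein pair gets 1/3.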

5.
Lee K, Kim DW, Na D, Lee KH, Lee D. Nucleic Acids Research, 2006, 34(17): 4655-4666
Subcellular localization is one of the key functional characteristics of proteins. An automatic and efficient prediction method for protein subcellular localization is highly desirable owing to the need for large-scale genome analysis. From a machine learning point of view, a dataset of protein localization has several characteristics: the dataset has too many classes (there are more than 10 localizations in a cell), it is a multi-label dataset (a protein may occur in several different subcellular locations), and it is too imbalanced (the number of proteins in each localization is remarkably different). Even though many previous works have addressed the prediction of protein subcellular localization, none of them effectively tackles these characteristics at the same time. Thus, a new computational method for protein localization is needed for more reliable outcomes. To address the issue, we present a protein localization predictor based on D-SVDD (PLPD) for the prediction of protein localization, which can find the likelihood of a specific localization of a protein more easily and more correctly. Moreover, we introduce three measurements for the more precise evaluation of a protein localization predictor. Results on various datasets derived from the experiments of Huh et al. (2003) suggest that the proposed PLPD method represents a different approach that might play a complementary role to existing methods, such as the Nearest Neighbor method and the discriminate covariant method. Finally, after finding a good boundary for each localization using the 5184 classified proteins as training data, we predicted 138 proteins whose subcellular localizations could not be clearly observed by the experiments of Huh et al. (2003).

6.

Background

The fungal pathogen Fusarium graminearum (teleomorph Gibberella zeae) is the causal agent of several destructive crop diseases, in which a set of genes usually work in concert to cause disease. To function appropriately, the F. graminearum proteins inside one cell should be assigned to different compartments, i.e. subcellular localizations. Therefore, the subcellular localizations of F. graminearum proteins can provide insights into protein functions and pathogenic mechanisms of this destructive pathogen fungus. Unfortunately, there is no subcellular localization information for F. graminearum proteins available now. Because laboratory experiments are expensive and time-consuming, computational approaches provide an alternative way to predict F. graminearum protein subcellular localizations.

Results

In this paper, we developed a novel predictor, namely FGsub, to predict F. graminearum protein subcellular localizations from the primary structures. First, a non-redundant fungi data set with subcellular localization annotation is collected from UniProtKB database and used as training set, where the subcellular locations are classified into 10 groups. Subsequently, Support Vector Machine (SVM) is trained on the training set and used to predict F. graminearum protein subcellular localizations for those proteins that do not have significant sequence similarity to those in training set. The performance of SVMs on training set with 10-fold cross-validation demonstrates the efficiency and effectiveness of the proposed method. In addition, for F. graminearum proteins that have significant sequence similarity to those in training set, BLAST is utilized to transfer annotations of homologous proteins to uncharacterized F. graminearum proteins so that the F. graminearum proteins are annotated more comprehensively.

Conclusions

In this work, we present FGsub to predict F. graminearum protein subcellular localizations in a comprehensive manner. We make four contributions to this field. First, we present a new algorithm to cope with the imbalance problem that arises in protein subcellular localization prediction, which can solve the imbalance problem and avoid false positive results. Second, we design an ensemble classifier which employs feature selection to further improve prediction accuracy. Third, we use BLAST to complement machine learning based methods, which enlarges our prediction coverage. Last and most important, we predict the subcellular localizations of 12786 F. graminearum proteins, which provide insights into protein functions and pathogenic mechanisms of this destructive pathogen fungus.

7.
Colorectal cancer (CRC) is the second deadliest cancer worldwide. Here, we aimed to study metastasis mechanisms using spatial proteomics in the KM12 cell model. Cells were SILAC-labeled and fractionated into five subcellular fractions corresponding to: cytoplasm, plasma, mitochondria and ER/Golgi membranes, nuclear, chromatin-bound and cytoskeletal proteins and analyzed with high resolution mass spectrometry. We provide localization data of 4863 quantified proteins in the different subcellular fractions. A total of 1318 proteins with at least 1.5-fold change were deregulated in highly metastatic KM12SM cells with respect to KM12C cells. The protein network organization, protein complexes and functional pathways associated with CRC metastasis were revealed with spatial resolution. Although 92% of the differentially expressed proteins showed the same deregulation in all subcellular compartments, a subset of 117 proteins (8%) showed opposite changes in different subcellular localizations. The chaperonin CCT, the Eif2 and Eif3 initiation of translation and the oxidative phosphorylation complexes, together with an important number of guanine nucleotide-binding proteins, were deregulated in abundance and localization within the metastatic cells. Particularly relevant was the relationship of deregulated protein complexes with exosome secretion. The knowledge of the spatial proteome alterations at subcellular level contributes to clarifying the molecular mechanisms underlying colorectal cancer metastasis and to identifying potential targets of therapeutic intervention.

8.
9.
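This study examines how protein-protein interactions in yeast relate to gene expression profiles and protein subcellular localization. First, four datasets of protein pairs were constructed: a positive interaction set, a negative set, a negative set of randomly paired proteins, and a mixed set. Then, for all protein pairs in the four datasets, these high-throughput data were cross-analyzed quantitatively by comparing the distributions of distance-based gene co-expression and the co-localization rates of those pairs whose subcellular localizations are known. The results show that, compared with non-interacting pairs, interacting protein pairs have more similar gene expression profiles and are more likely to share the same subcellular localization. The results also reveal the overall trends in how these protein features are correlated.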
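The two quantities compared in the abstract above, the co-localization rate among annotated protein pairs and a distance-based co-expression measure, can be sketched as follows. The abstract does not specify the distance used, so the Euclidean distance here is an assumption, as are the function names and the rule that a pair counts as co-localized when the two proteins share at least one compartment.

```python
import math


def colocalization_rate(pairs, localization):
    """Fraction of annotated protein pairs that share at least one compartment.

    pairs: iterable of (protein, protein) tuples.
    localization: dict mapping a protein to a set of compartment names.
    Pairs with an unannotated member are skipped.
    """
    annotated = [(a, b) for a, b in pairs
                 if a in localization and b in localization]
    if not annotated:
        return float("nan")
    shared = sum(1 for a, b in annotated if localization[a] & localization[b])
    return shared / len(annotated)


def expression_distance(profile_a, profile_b):
    """Euclidean distance between two equal-length expression profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("expression profiles must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(profile_a, profile_b)))
```

Computing both quantities separately for the positive, negative, random, and mixed sets would allow the kind of side-by-side comparison the abstract describes.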

10.
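Studying the subcellular location of eukaryotic proteins is the basis for understanding their function and for investigating the mechanisms of protein-related signaling pathways in depth. It can also help in understanding disease pathogenesis and in developing new drugs, so this research is of great significance. With the completion of genome sequencing, eukaryotic protein sequence data have grown rapidly, posing further challenges for the study of eukaryotic protein subcellular location. Traditional experimental methods cannot keep pace with the rapid growth of protein data. Computational prediction methods, which use bioinformatics to process large-scale data, can yield subcellular location information for large numbers of eukaryotic proteins in a relatively short time and thus make up for the limitations of experimental methods. Predicting the subcellular location of eukaryotic proteins computationally has therefore become one of the active research topics in bioinformatics. This article reviews recent progress in eukaryotic protein subcellular location prediction from three perspectives: extraction of protein feature information, computational prediction methods, and evaluation of prediction performance.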

11.
12.
In living systems, the chemical space and functional repertoire of proteins are dramatically expanded through the post-translational modification (PTM) of various amino acid residues. These modifications frequently trigger unique protein–protein interactions (PPIs) – for example with reader proteins that directly bind the modified amino acid residue – which leads to downstream functional outcomes. The modification of a protein can also perturb its PPI network indirectly, for example, through altering its conformation or subcellular localization. Uncovering the network of unique PTM-triggered PPIs is essential to fully understand the roles of an ever-expanding list of PTMs in our biology. In this review, we discuss established strategies and current challenges associated with this endeavor.

13.
Predicting protein localization in budding yeast   (total citations: 4; self-citations: 0; citations by others: 4)
MOTIVATION: Most of the existing methods for predicting protein subcellular location deal with cases limited to between two and five localizations, and only a few of them can be effectively extended to cover 12-14 localizations. This is because the more locations are involved, the poorer the success rate becomes. Besides, some proteins may occur in several different subcellular locations, i.e. bear the feature of 'multiplex locations'. So far there is no method that can be used to effectively treat the difficult multiplex location problem. The present study was initiated in an attempt to address (1) how to efficiently identify the localization of a query protein among many possible subcellular locations, and (2) how to deal with the case of multiplex locations. RESULTS: By hybridizing gene ontology, functional domain and pseudo amino acid composition approaches, a new method has been developed that can be used to predict subcellular localization of proteins with the multiplex location feature. A global analysis of the proteins in budding yeast classified into 22 locations was performed by jack-knife cross-validation with the new method. The overall success identification rate thus obtained is 70%. In contrast to this, the corresponding rates obtained by some other existing methods were only 13-14%, indicating that the new method is very powerful and promising. Furthermore, predictions were made for the four proteins whose localizations could not be determined by experiments, as well as for the 236 proteins whose localizations in budding yeast were ambiguous according to experimental observations. However, according to our predicted results, many of these 'ambiguous proteins' were found to have the same score and ranking for several different subcellular locations, implying that they may simultaneously exist, or move around, in these locations. This finding is intriguing because it reflects the dynamic feature of these proteins in a cell that may be associated with some special biological functions.

14.
The subcellular localization of proteins is critical to their biological roles. Moreover, whether a protein is membrane-bound, secreted, or intracellular affects the usefulness of, and the strategies for, using a protein as a diagnostic marker or a target for therapy. We employed a rapid and efficient experimental approach to classify thousands of human gene products as either "membrane-associated/secreted" (MS) or "cytosolic/nuclear" (CN). Using subcellular fractionation methods, we separated mRNAs associated with membranes from those associated with the soluble cytosolic fraction and analyzed these two pools by comparative hybridization to DNA microarrays. Analysis of 11 different human cell lines, representing lymphoid, myeloid, breast, ovarian, hepatic, colon, and prostate tissues, identified more than 5,000 previously uncharacterized MS and more than 6,400 putative CN genes at high confidence levels. The experimentally determined localizations correlated well with in silico predictions of signal peptides and transmembrane domains, but also significantly increased the number of human genes that could be cataloged as encoding either MS or CN proteins. Using gene expression data from a variety of primary human malignancies and normal tissues, we rationally identified hundreds of MS gene products that are significantly overexpressed in tumors compared to normal tissues and thus represent candidates for serum diagnostic tests or monoclonal antibody-based therapies. Finally, we used the catalog of CN gene products to generate sets of candidate markers of organ-specific tissue injury. The large-scale annotation of subcellular localization reported here will serve as a reference database and will aid in the rational design of diagnostic tests and molecular therapies for diverse diseases.

15.
Addressing protein localization within the nucleus   (total citations: 1; self-citations: 0; citations by others: 1)
Bridging the gap between the number of gene sequences in databases and the number of gene products that have been functionally characterized in any way is a major challenge for biology. A key characteristic of proteins, which can begin to elucidate their possible functions, is their subcellular location. A number of experimental approaches can reveal the subcellular localization of proteins in mammalian cells. However, genome databases now contain predicted sequences for a large number of potentially novel proteins that have yet to be studied in any way, let alone have their subcellular localization determined. Here we ask whether using bioinformatics tools to analyse the sequence of proteins whose subnuclear localizations have been determined can reveal characteristics or signatures that might allow us to predict localization for novel protein sequences.

16.
The subcellular localization of Arf family proteins is generally thought to be determined by their corresponding guanine nucleotide exchange factors. By promoting GTP binding, guanine nucleotide exchange factors induce conformational changes of Arf proteins exposing their N-terminal amphipathic helices, which then insert into the membranes to stabilize the membrane association process. Here, we found that the N-terminal amphipathic motifs of the Golgi-localized Arf family protein, Arfrp1, and the endosome- and plasma membrane–localized Arf family protein, Arl14, play critical roles in spatial determination. Exchanging the amphipathic helix motifs between these two Arf proteins causes the switch of their localizations. Moreover, the amphipathic helices of Arfrp1 and Arl14 are sufficient for cytosolic proteins to be localized into a specific cellular compartment. The spatial determination mediated by the Arfrp1 helix requires its binding partner Sys1. In addition, the residues that are required for the acetylation of the Arfrp1 helix and the myristoylation of the Arl14 helix are important for the specific subcellular localization. Interestingly, Arfrp1 and Arl14 are recruited to their specific cellular compartments independent of GTP binding. Our results demonstrate that the amphipathic motifs of Arfrp1 and Arl14 are sufficient for determining specific subcellular localizations in a GTP-independent manner, suggesting that the membrane association and activation of some Arf proteins are uncoupled.

17.
Activation-induced cytidine deaminase (Aid), a unique enzyme that deaminates cytosine in DNA, shuttles between the nucleus and the cytoplasm. A recent study proposed a novel function of Aid in active DNA demethylation via deamination of 5-hydroxymethylcytosine, which is converted from 5-methylcytosine by the Ten-eleven translocation (Tet) family of enzymes. In this study, we examined the effect of simultaneous expression of Aid and Tet family proteins on the subcellular localization of each protein. We found that overexpressed Aid is mainly localized in the cytoplasm, whereas Tet1 and Tet2 are localized in the nucleus, and Tet3 is localized in both the cytoplasm and the nucleus. However, nuclear Tet proteins were gradually translocated to the cytoplasm when co-expressed with Aid. We also show that Aid-mediated translocation of Tet proteins is associated with Aid shuttling. Here we propose a possible role for Aid as a regulator of the subcellular localization of Tet family proteins.

18.
Mutant Presenilin proteins cause early-onset familial Alzheimer's disease in humans, and Caenorhabditis elegans Presenilins may facilitate Notch receptor signaling. We have isolated a Drosophila Presenilin homologue and determined the spatial and temporal distribution of the encoded protein as well as its localization relative to the fly Notch protein. In contrast to previous mRNA in situ studies, we find that Presenilin is widely expressed throughout oogenesis, embryogenesis, and imaginal development, and generally accumulates at comparable levels in neuronal and nonneuronal tissues. Double immunolabeling with Notch antibodies revealed that Presenilin and Notch are coexpressed in many tissues throughout Drosophila development and display partially overlapping subcellular localizations, supporting a possible functional link between Presenilin and Notch.

19.
Defects in axonal transport and synaptic dysfunctions are associated with early stages of several neurodegenerative diseases including Alzheimer's, Huntington's, Parkinson's, and prion diseases. Here, we tested whether full-length mammalian prion protein (rPrP), converted into three conformationally different isoforms, can induce pathological changes regarded as early subcellular hallmarks of prion disease. We employed human embryonal teratocarcinoma NTERA2 cells (NT2) that were terminally differentiated into neuronal and glial cells and co-cultured together. We found that rPrP fibrils but not alpha-rPrP or soluble beta-sheet rich oligomers caused degeneration of neuronal processes. Degeneration of processes was accompanied by a collapse of microtubules and aggregation of cytoskeletal proteins, formation of neuritic beads, and a dramatic change in localization of synaptophysin. Our studies demonstrated the utility of NT2 cells as a valuable human model system for elucidating subcellular events of prion pathogenesis, and supported the emerging hypothesis that defects in neuronal transport and synaptic abnormalities are early pathological hallmarks associated with prion diseases.

20.
Small G proteins play a central role in the organization of secretory and endocytotic pathways. The recruitment of some effectors, including vesicle coat proteins, is mediated by the ADP-ribosylation factor (Arf) family. Arf proteins have distinct subcellular localizations. ArfGAPs (Arf GTPase-activating proteins) regulate Arf GTPase activity. Thus, each ArfGAP is distinctly localized to allow it to maintain a specific interaction with its target Arf(s). However, the domains that regulate the subcellular localization of ArfGAPs and the way in which these subcellular localizations affect the target specificities of ArfGAPs remain unclear. Recently, we identified two novel ArfGAPs, SMAP1 (Small ArfGAP protein 1) and SMAP2. In the current study, we identified sequences in the carboxy-terminal region of SMAP2 that are critical for its specific subcellular localization and its specificity for Arf proteins.
