Similar articles
1.
《Genomics》2019,111(6):1831-1838
Knowing where a protein is localized provides valuable information for elucidating its function. In recent years, advances in human genomics and proteomics have made it possible to characterize human proteins located in different subcellular compartments. In this study, we used topological and biological properties to characterize human proteins from six subcellular localizations. Almost all of these properties were found to differ significantly among the six protein categories. Network topology analysis indicated that several significant topological properties, including degree and k-core, were higher for mitochondrial proteins. Biological property analysis showed that nuclear proteins appeared to be correlated with important biological functions. We hope these findings will help in comprehensively understanding the biological function of proteins and in predicting protein subcellular localization in humans.

2.
The subcellular location of a protein is useful information for determining its function, screening for drug candidates, designing vaccines, annotating gene products, and selecting relevant proteins for further study. Computational prediction of subcellular localization deals with predicting the location of a protein from its amino acid sequence. For a computational localization prediction method to be more accurate, it should exploit all relevant biological features that contribute to subcellular localization. In this work, we extracted biological features from the full-length protein sequence in order to incorporate more biological information. A new biological feature, the distribution of atomic composition, is used together with multiple physicochemical properties, amino acid composition, three-part amino acid composition, and sequence similarity to predict the subcellular location of the protein. Support Vector Machines are designed for four modules, and the prediction is made by a weighted voting system. Our system predicts with accuracies of 100%, 82.47%, and 88.81% for the self-consistency test, jackknife test, and independent data test, respectively. Our results provide evidence that prediction based on biological features derived from the full-length amino acid sequence gives better accuracy than prediction based on features derived from the N-terminus alone. Treating the features as a distribution over the entire sequence brings out the underlying property distribution in greater detail and thereby enhances prediction accuracy.
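The weighted voting step mentioned in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the four modules are weighted or how ties are resolved, so the module names, the weights, and the alphabetical tie-break below are all assumptions made for illustration, not the authors' scheme.

```python
from collections import defaultdict


def weighted_vote(module_predictions, module_weights):
    """Combine one predicted label per module by summing the module weights.

    module_predictions: dict mapping module name -> predicted location label
    module_weights:     dict mapping module name -> non-negative weight
    (Both the module names and the weights are hypothetical.)
    """
    scores = defaultdict(float)
    for module, label in module_predictions.items():
        scores[label] += module_weights[module]
    # The label with the highest total weight wins; iterating over the
    # sorted labels makes ties resolve alphabetically (an assumption).
    return max(sorted(scores), key=lambda label: scores[label])


if __name__ == "__main__":
    predictions = {
        "composition": "nucleus",
        "physicochemical": "cytoplasm",
        "similarity": "cytoplasm",
        "atomic": "nucleus",
    }
    weights = {
        "composition": 0.4,
        "physicochemical": 0.2,
        "similarity": 0.3,
        "atomic": 0.2,
    }
    print(weighted_vote(predictions, weights))
```

In practice the weights would probably be derived from each module's validation accuracy, but the abstract does not confirm this.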

3.
The subcellular localization of a protein is important for its proper function. Escherichia coli MinE is a small protein with clear subcellular localization, which makes it a good model for studying protein localization mechanisms. In the present study, a series of recombinant minE genes truncated at one end or in the middle regions and fused with egfp was constructed, and the resulting recombinant proteins could compete functionally with chromosomal MinE. Our results showed that the sequences related to the subcellular localization of MinE span several functional domains, demonstrating that MinE positioning in cells depends on multiple factors. The eGFP fusions with some N-terminally truncated MinE variants resulted in different cell phenotypes and localization features, implying that these fusions can interfere with the function of chromosomal MinE, similar to the MinE36–88 phenotype in the previous report. The amino acids in the region 32–48 are sensitive to changes that alter MinE conformation and influence its dimerization. Some truncated proteins may be structurally unstable. Thus, MinE localization is a prerequisite for its proper anti-MinCD function, and some new features of MinE were demonstrated. This approach can be extended to subcellular localization research on other essential proteins.

4.
The effect of the culture medium on the outer membrane protein composition of Vibrio vulnificus strain 393, isolated from human blood, was examined. Only one major outer membrane protein, with an apparent molecular weight of 37,000 (37K protein) or 34,000 (34K protein), was formed in cells grown in 3% NaCl-BHI broth or chemically defined medium, respectively. The production of one major outer membrane protein was also observed in other isolates from humans and asari clam when they were grown in 3% NaCl-BHI broth. On the other hand, three major outer membrane proteins, with apparent molecular weights of 48,000 (48K protein), 37,000 (37K protein), and 34,000 (34K protein), were produced in cells grown in 3% NaCl-nutrient broth. The three proteins from strain 393 (48K, 37K, and 34K) were purified and their amino acid compositions were determined. Although the amino acid compositions of the three proteins differed slightly, all three porin-like proteins showed the characteristic properties of the porins of Escherichia coli and Salmonella typhimurium. Immunoblot analysis of the outer membrane proteins from four vibrios, E. coli, and S. typhimurium using monospecific antisera against these three porin-like proteins showed that only the antiserum against the 37K protein cross-reacted with the outer membrane proteins from all the strains tested.

5.

Background  

Protein subcellular localization is an important determinant of protein function, and hence reliable methods for predicting localization are needed. A number of prediction algorithms have been developed based on amino acid compositions or on the N-terminal characteristics (signal peptides) of proteins. However, such approaches lead to a loss of contextual information. Moreover, where information about the physicochemical properties of amino acids has been used, the methods employed to exploit that information are less than optimal and could use it more effectively.

6.
Mimicking cellular sorting improves prediction of subcellular localization (cited 27 times: 0 self-citations, 27 citations by others)
Predicting the native subcellular compartment of a protein is an important step toward elucidating its function. Here we introduce LOCtree, a hierarchical system combining support vector machines (SVMs) and other prediction methods. LOCtree predicts the subcellular compartment of a protein by mimicking the mechanism of cellular sorting and exploiting a variety of sequence and predicted structural features in its input. Currently LOCtree does not predict localization for membrane proteins, since the compositional properties of membrane proteins differ significantly from those of non-membrane proteins. While any information about function can be used by the system, we present estimates of performance that are valid when only the amino acid sequence of a protein is known. When evaluated on a non-redundant test set, LOCtree achieved sustained levels of 74% accuracy for non-plant eukaryotes, 70% for plants, and 84% for prokaryotes. We rigorously benchmarked LOCtree against the best alternative methods for localization prediction. LOCtree outperformed all other methods in nearly all benchmarks. Localization assignments using LOCtree agreed quite well with data from recent large-scale experiments. Our preliminary analysis of a few entirely sequenced organisms, namely human (Homo sapiens), yeast (Saccharomyces cerevisiae), and weed (Arabidopsis thaliana), suggested that over 35% of all non-membrane proteins are nuclear, that about 20% are retained in the cytosol, and that every fifth protein in the weed resides in the chloroplast.

7.
Identifying the subcellular localization of proteins is particularly helpful in the functional annotation of gene products. In this study, we use Machine Learning and Exploratory Data Analysis (EDA) techniques to examine and characterize the amino acid sequences of human proteins localized in nine cellular compartments. A dataset of 3,749 human protein sequences was extracted from the SWISS-PROT database. Feature vectors were created to capture specific amino acid sequence characteristics. Relative to a Support Vector Machine, a Multi-layer Perceptron, and a Naive Bayes classifier, the C4.5 Decision Tree algorithm was the most consistent performer across all nine compartments in reliably predicting the subcellular localization of proteins based on their amino acid sequences (average Precision=0.88; average Sensitivity=0.86). Furthermore, EDA graphics characterized essential features of proteins in each compartment. For example, proteins localized to the plasma membrane had higher proportions of hydrophobic amino acids; cytoplasmic proteins had higher proportions of neutral amino acids; and mitochondrial proteins had higher proportions of neutral amino acids and lower proportions of polar amino acids. These data showed that the C4.5 classifier and EDA tools can be effective for characterizing and predicting the subcellular localization of human proteins based on their amino acid sequences.

8.
Ycf1p is a member of the ATP-binding cassette transporter family of membrane proteins. Strong sequence similarity has been observed between Ycf1p, the cystic fibrosis transmembrane conductance regulator (CFTR), and multidrug resistance protein (MRP). In this work, we have examined the functional significance of several of the conserved amino acid residues and the genetic requirements for Ycf1p subcellular localization. Biochemical fractionation experiments have established that Ycf1p, expressed at single-copy gene levels, co-fractionates with the vacuolar membrane and that this co-fractionation is independent of vps15, vps34, or end3 gene function. Several cystic fibrosis-associated alleles of the CFTR were introduced into Ycf1p and found to elicit defects analogous to those seen in the CFTR. An amino-terminal extension shared between Ycf1p and MRP, but absent from CFTR, was found to be required for Ycf1p function but not for its subcellular localization. Mutant forms of Ycf1p were also identified that exhibited enhanced biological function relative to the wild-type protein. These studies indicate that Ycf1p will provide a simple, genetically tractable model system for studying the trafficking and function of ATP-binding cassette transporter proteins such as the CFTR and MRP.

9.
Wrch-1 is a Rho family GTPase that shares strong sequence and functional similarity with Cdc42. Like Cdc42, Wrch-1 can promote anchorage-independent growth transformation. We determined that activated Wrch-1 also promoted anchorage-dependent growth transformation of NIH 3T3 fibroblasts. Wrch-1 contains a distinct carboxyl-terminal extension not found in Cdc42, suggesting potential differences in subcellular location and function. Consistent with this, we found that Wrch-1 associated extensively with the plasma membrane and endosomes, rather than with cytosol and perinuclear membranes like Cdc42. Like Cdc42, Wrch-1 terminates in a CAAX tetrapeptide motif (CCFV; C is cysteine, A is an aliphatic amino acid, and X is any amino acid), suggesting that Wrch-1 may be prenylated similarly to Cdc42. Most surprisingly, unlike Cdc42, Wrch-1 did not incorporate isoprenoid moieties, and Wrch-1 membrane localization was not altered by inhibitors of protein prenylation. Instead, we showed that Wrch-1 is modified by the fatty acid palmitate, and pharmacologic inhibition of protein palmitoylation caused mislocalization of Wrch-1. Most interestingly, mutation of the second cysteine of the CCFV motif (CCFV > CSFV), but not the first, abrogated both Wrch-1 membrane localization and transformation. These results suggest that Wrch-1 membrane association, subcellular localization, and biological activity are mediated by a novel membrane-targeting mechanism distinct from that of Cdc42 and other isoprenylated Rho family GTPases.

10.
MOTIVATION: The subcellular location of a protein is closely correlated with its function. Thus, computational prediction of subcellular locations from amino acid sequence information would help the annotation and functional prediction of protein-coding genes in complete genomes. We have developed a method based on support vector machines (SVMs). RESULTS: We considered 12 subcellular locations in eukaryotic cells: chloroplast, cytoplasm, cytoskeleton, endoplasmic reticulum, extracellular medium, Golgi apparatus, lysosome, mitochondrion, nucleus, peroxisome, plasma membrane, and vacuole. We constructed a data set of proteins with known locations from the SWISS-PROT database. A set of SVMs was trained to predict the subcellular location of a given protein based on its amino acid, amino acid pair, and gapped amino acid pair compositions. The predictors based on these different compositions were then combined using a voting scheme. Results obtained through 5-fold cross-validation tests showed an improvement in prediction accuracy over the algorithm based on amino acid composition only. This prediction method is available via the Internet.
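As an illustration of the pair-based features named in this abstract, the following Python sketch computes a 400-element amino acid pair composition, with an optional gap between the two residues. The abstract does not give the gap sizes, the normalization, or the handling of non-standard residues, so the choices here (normalizing by the number of counted pairs, skipping pairs that contain non-standard letters) are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def gapped_pair_composition(sequence, gap=0):
    """Return 400 frequencies of residue pairs separated by `gap` residues.

    gap=0 gives the ordinary (adjacent) amino acid pair composition;
    gap=1 pairs residue i with residue i+2, and so on.
    """
    counts = dict.fromkeys(PAIRS, 0)
    step = gap + 1
    total = 0
    for i in range(len(sequence) - step):
        pair = sequence[i] + sequence[i + step]
        if pair in counts:  # skip pairs containing non-standard letters
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(PAIRS)
    return [counts[p] / total for p in PAIRS]


if __name__ == "__main__":
    adjacent = gapped_pair_composition("ACAC", gap=0)
    print(adjacent[PAIRS.index("AC")], adjacent[PAIRS.index("CA")])
```

Each composition type (single residue, adjacent pair, gapped pair) would feed its own classifier before the voting step; that wiring is not shown here.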

11.
Shi JY  Zhang SW  Pan Q  Zhou GP 《Amino acids》2008,35(2):321-327
In the post-genome age, there is an urgent need to develop reliable and effective computational methods for predicting the subcellular localization of the rapidly growing number of newly found proteins. Here, a novel form of pseudo amino acid (PseAA) composition, the so-called “amino acid composition distribution” (AACD), is introduced. First, a protein sequence is divided equally into multiple segments. Then, the amino acid composition of each segment is calculated in series. After that, each protein sequence can be represented by a feature vector. Finally, the feature vectors of all sequences thus obtained are input into multi-class support vector machines to predict the subcellular localization. The results show that AACD is quite effective in representing protein sequences for the purpose of predicting protein subcellular localization.
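The segmentation steps described in this abstract can be sketched in a few lines of Python. The number of segments and the treatment of sequences whose length is not a multiple of the segment count are not stated in the abstract; the default of three segments and the rounding of segment boundaries below are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(segment):
    """Return the 20 amino acid frequencies of one segment."""
    n = sum(segment.count(a) for a in AMINO_ACIDS)  # standard residues only
    return [segment.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def aacd_features(sequence, n_segments=3):
    """Split the sequence into near-equal segments and concatenate
    the amino acid composition of each segment in order."""
    length = len(sequence)
    # Segment boundaries; rounding spreads any remainder across segments.
    bounds = [round(i * length / n_segments) for i in range(n_segments + 1)]
    features = []
    for start, end in zip(bounds, bounds[1:]):
        features.extend(composition(sequence[start:end]))
    return features


if __name__ == "__main__":
    vector = aacd_features("AAACCCDDD", n_segments=3)
    print(len(vector))  # 20 values per segment
```

The resulting vector has 20 × `n_segments` elements and could be passed to any multi-class classifier; the SVM training step itself is omitted.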

12.
Proper cellular localization is required for the function of many proteins. The CaaX prenyltransferases (where CaaX indicates a cysteine followed by two aliphatic amino acids and a variable amino acid) direct the subcellular localization of a large group of proteins by catalyzing the attachment of hydrophobic isoprenoid moieties onto C-terminal CaaX motifs, thus facilitating membrane association. This group of enzymes includes farnesyltransferase (Ftase) and geranylgeranyltransferase-I (Ggtase-I). Classically, the variable (X) amino acid determines whether a protein will be an Ftase or Ggtase-I substrate, with Ggtase-I substrates often containing CaaL motifs. In this study, we identify the gene encoding the β subunit of Ggtase-I (CDC43) and demonstrate that Ggtase-mediated activity is not essential. However, Cryptococcus neoformans CDC43 is important for thermotolerance, morphogenesis, and virulence. We find that Ggtase-I function is required for full membrane localization of Rho10 and the two Cdc42 paralogs (Cdc42 and Cdc420). Interestingly, the related Rac and Ras proteins are not mislocalized in the cdc43Δ mutant even though they contain similar CaaL motifs. Additionally, the membrane localization of each of these GTPases is dependent on the prenylation of the CaaX cysteine. These results indicate that C. neoformans CaaX prenyltransferases may recognize their substrates in a manner distinct from existing models of prenyltransferase specificity. They also suggest that the C. neoformans Ftase, which has been shown to be more important for C. neoformans proliferation and viability, may be the primary prenyltransferase for proteins that are typically geranylgeranylated in other species.

13.
The Ras-like small GTPases RalA and RalB are well-validated effectors of RAS oncogene-driven human cancer growth, and pharmacologic inhibitors of Ral function may provide an effective anti-Ras therapeutic strategy. Intriguingly, although RalA and RalB share strong overall amino acid sequence identity, exhibit essentially identical structural and biochemical properties, and can utilize the same downstream effectors, they also exhibit divergent and sometimes opposing roles in the tumorigenic and metastatic growth of different cancer types. These distinct biological functions have been attributed largely to sequence divergence in their carboxyl-terminal hypervariable regions. However, the role of posttranslational modifications signaled by the hypervariable region carboxyl-terminal tetrapeptide CAAX motif (C = cysteine, A = aliphatic amino acid, X = terminal residue) in Ral isoform-selective functions has not been addressed. We determined that these modifications have distinct roles and consequences. Both RalA and RalB require Ras converting CAAX endopeptidase 1 (RCE1) for association with the plasma membrane, albeit not with endomembranes, and loss of RCE1 caused mislocalization as well as sustained activation of both RalA and RalB. In contrast, isoprenylcysteine carboxylmethyltransferase (ICMT) deficiency disrupted plasma membrane localization only of RalB, whereas RalA depended on ICMT for efficient endosomal localization. Furthermore, the absence of ICMT increased the stability of RalB but not RalA protein. Finally, palmitoylation was critical for subcellular localization of RalB but not RalA. In summary, we have identified striking isoform-specific consequences of distinct CAAX-signaled posttranslational modifications that contribute to the divergent subcellular localization and activity of RalA and RalB.

14.
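Existing protein subcellular localization methods are designed for water-soluble proteins and do not apply to transmembrane proteins, while dedicated transmembrane topology predictors are not designed for subcellular localization. This paper improves the model structure of the transmembrane topology predictor TMPHMMLoc and designs a new second-order hidden Markov model. Model parameters are estimated with the Baum-Welch algorithm generalized to second-order models, and the models built for the individual subcellular locations are integrated into a single predictor. Tests on the data set show that this method performs significantly better than the support vector machine and fuzzy k-nearest-neighbor methods designed for soluble proteins, and also better than the hidden Markov model method proposed in TMPHMMLoc, making it an effective method for predicting the subcellular localization of transmembrane proteins.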

15.
The protein 4.1 superfamily comprises a diverse group of cytoplasmic proteins, many of which have been shown to associate with the plasma membrane via binding to specific transmembrane proteins. Coracle, a Drosophila protein 4.1 homologue, is required during embryogenesis and is localized to the cytoplasmic face of the septate junction in epithelial cells. Using in vitro mutagenesis, we demonstrate that the amino-terminal 383 amino acids of Coracle define a functional domain that is both necessary and sufficient for proper septate junction localization in transgenic embryos. Genetic mutations within this domain disrupt the subcellular localization of Coracle and severely affect its genetic function, indicating that correct subcellular localization is essential for Coracle function. Furthermore, the localization of Coracle and that of the transmembrane protein Neurexin to the septate junction are interdependent, suggesting that Coracle and Neurexin interact with one another at the cytoplasmic face of the septate junction. Consistent with this notion, immunoprecipitation and in vitro binding studies demonstrate that the amino-terminal 383 amino acids of Coracle and the cytoplasmic domain of Neurexin interact directly. Together these results indicate that Coracle provides essential membrane-organizing functions at the septate junction, and that these functions are carried out by an amino-terminal domain that is conserved in all protein 4.1 superfamily members.

16.

Background  

Amino acids in proteins are not used equally. Some of the differences in the amino acid composition of proteins are between species (mainly due to nucleotide composition and lifestyle) and some are between proteins from the same species (related to protein function, expression, or subcellular localization, for example). Because several factors contribute to differences in amino acid usage, it is difficult both to analyze these differences and to separate the contributions made by each factor.

17.
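[Objective] Regulators of G-protein signaling (RGS), as negative regulators of the G-protein signal transduction pathway, play an important role in regulating the pathogenicity and sexual reproduction of plant pathogens. Studying the relationship between the types of RGS proteins in fungi and their physicochemical properties and characteristics lays a solid theoretical foundation for future in-depth functional analysis of the different classes of RGS in different fungi. [Methods] In earlier work, 229 RGS proteins from 49 fungi, including model organisms, pathogens, and non-pathogenic fungi...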

18.
Gao QB  Wang ZZ  Yan C  Du YH 《FEBS letters》2005,579(16):3444-3448
To understand the structure and function of a protein, an important task is to know where it occurs in the cell. Thus, a computational method for properly predicting the subcellular location of proteins would be significant in interpreting the original data produced by large-scale genome sequencing projects. The present work explores an effective method for extracting features from the protein primary sequence and seeks a novel measure of similarity among proteins for classifying a protein to its proper subcellular location. We considered four locations in eukaryotic cells and three locations in prokaryotic cells, which have been investigated by several groups in the past. A combined primary-sequence feature, defined as a 430-dimensional (430D) vector, was used to represent a protein; it includes 20 amino acid compositions, 400 dipeptide compositions, and 10 physicochemical properties. To evaluate the prediction performance of this encoding scheme, a jackknife test based on the nearest neighbor algorithm was employed. The prediction accuracies for cytoplasmic, extracellular, mitochondrial, and nuclear proteins in the former dataset were 86.3%, 89.2%, 73.5%, and 89.4%, respectively, and the total prediction accuracy reached 86.3%. For cytoplasmic, extracellular, and periplasmic proteins in the latter dataset, the prediction accuracies were 97.4%, 86.0%, and 79.7%, respectively, and a total prediction accuracy of 92.5% was achieved. The results indicate that this method outperforms some existing approaches based on amino acid composition alone or on amino acid composition combined with dipeptide composition.
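The evaluation protocol in this abstract, a jackknife (leave-one-out) test with a nearest neighbor classifier, can be sketched as follows. This is only an approximation under stated assumptions: the abstract does not name the 10 physicochemical properties or the similarity measure, so the sketch encodes only the 20 amino acid and 400 dipeptide compositions (420 of the 430 dimensions) and uses plain Euclidean distance.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def encode(sequence):
    """20 amino acid frequencies followed by 400 dipeptide frequencies.

    The 10 physicochemical properties of the 430D vector are omitted
    because the abstract does not specify them.
    """
    n = max(len(sequence), 1)
    aa = [sequence.count(a) / n for a in AMINO_ACIDS]
    m = max(len(sequence) - 1, 1)
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:  # ignore dipeptides with non-standard letters
            counts[pair] += 1
    return aa + [counts[d] / m for d in DIPEPTIDES]


def jackknife_accuracy(sequences, labels):
    """Leave each sample out in turn and classify it by its nearest
    neighbour (Euclidean distance, an assumption) among the rest.
    Requires at least two samples."""
    vectors = [encode(s) for s in sequences]
    correct = 0
    for i, query in enumerate(vectors):
        nearest = min(
            (j for j in range(len(vectors)) if j != i),
            key=lambda j: math.dist(query, vectors[j]),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(vectors)


if __name__ == "__main__":
    toy_sequences = ["AAAA", "AAAC", "DDDD", "DDDE"]  # toy data, not real proteins
    toy_labels = ["x", "x", "y", "y"]
    print(jackknife_accuracy(toy_sequences, toy_labels))
```

Because every sample is held out exactly once, the jackknife estimate is deterministic for a given dataset, which is why abstracts in this listing often report it alongside independent-test accuracy.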

19.
The function of a protein is closely correlated with its subcellular localization. Probing the mechanism of protein sorting and predicting protein subcellular location can provide important clues for understanding the function of proteins. In this paper, we introduce a new PseAAC approach that encodes the protein sequence based on the physicochemical properties of amino acid residues. Each protein sample was defined as a 146-dimensional (146D) vector comprising the 20 amino acid composition components and 126 adjacent triune residue contents. To evaluate the effectiveness of this encoding scheme, we performed jackknife tests on three datasets using the support vector machine algorithm. The total prediction accuracies were 84.9%, 91.2%, and 92.6%, respectively. These satisfactory results indicate that our method could be a useful tool in bioinformatics and proteomics.

20.
We performed an extensive mutational analysis of the canonical mouse odorant receptor (OR) M71 to determine the properties of ORs that inhibit plasma membrane trafficking in heterologous expression systems. We used the M71::GFP fusion protein to directly assess plasma membrane localization and functionality of M71 in heterologous cells in vitro or in olfactory sensory neurons (OSNs) in vivo. OSN expression of M71::GFP shows only small differences in activity compared to untagged M71. However, M71::GFP could not traffic to the plasma membrane even in the presence of the proposed accessory proteins RTP1S or mβ2AR. To ask whether ORs contain an internal “kill sequence”, we mutated ~15 of the most highly conserved OR-specific amino acids not found among the trafficking non-OR GPCR superfamily; none of these mutants rescued trafficking. Adding various amino-terminal signal sequences or different glycosylation motifs also failed to produce trafficking. Neither the addition of the amino- and carboxy-terminal domains of mβ2AR nor the mutation Y289A in the highly conserved GPCR motif NPxxY rescued plasma membrane trafficking. The failure of targeted mutagenesis to rescue plasma membrane localization in heterologous cells suggests that OR trafficking deficits may not be attributable to conserved collinear motifs, but rather to the overall amino acid composition of the OR family. Thus, we performed an in silico analysis comparing the OR and other amine receptor superfamilies. We find that ORs contain fewer charged residues and more hydrophobic residues distributed throughout the protein, and a conserved overall amino acid composition. From our analysis, we surmise that it may be difficult to traffic ORs at high levels to the cell surface in vitro without making significant amino acid modifications. Finally, we observed specific increases in methionine and histidine residues as well as a marked decrease in tryptophan residues, suggesting that these changes provide ORs with special characteristics needed for them to function in olfactory neurons.
