Similar articles
 Found 20 similar articles; search took 31 ms
1.
Apoptosis proteins are very important for understanding the mechanism of programmed cell death. The localization of an apoptosis protein can provide valuable information about its molecular function, but predicting that localization is a challenging task. In our previous work we proposed an increment of diversity (ID) method that uses protein sequence information for this prediction task. In this work, based on the concept of Chou's pseudo-amino acid composition [Chou, K.C., 2001. Prediction of protein cellular attributes using pseudo-amino acid composition. Proteins: Struct. Funct. Genet. 43, 246-255 (Erratum: Chou, K.C., 2001, vol. 44, 60); Chou, K.C., 2005. Using amphiphilic pseudo-amino acid composition to predict enzyme subfamily classes. Bioinformatics 21, 10-19], a different pseudo-amino acid composition that uses hydropathy distribution information is introduced. A novel ID_SVM algorithm that combines ID with a support vector machine (SVM) is proposed. The method is applied to three data sets (317, 225, and 98 apoptosis proteins). In jackknife tests it achieves higher predictive success rates than previous algorithms.
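The abstract names the increment of diversity but does not state its formula. A minimal sketch follows, assuming the commonly used definition D(X) = N ln N − Σ n_i ln n_i for a count vector and ID(X, Y) = D(X + Y) − D(X) − D(Y). The function names and the use of plain amino acid counts as the feature vector are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_counts(seq):
    """Count each of the 20 standard residues in a sequence (illustrative feature vector)."""
    counts = Counter(seq)
    return [counts.get(a, 0) for a in AMINO_ACIDS]


def diversity(counts):
    """D(X) = N ln N - sum_i n_i ln n_i, treating 0 ln 0 as 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); it is zero when X and Y have identical counts."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)
```

Under this definition a query would be assigned to the class whose pooled count vector gives the smallest ID; how the ID values are passed to the SVM is not described in the abstract and is not sketched here.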

2.
Knowledge of membrane protein type often provides crucial hints toward determining the function of an uncharacterized membrane protein. With the avalanche of new protein sequences emerging in the post-genomic era, it is highly desirable to develop an automated, high-throughput method that identifies the types of newly found membrane proteins from their primary sequences, so that they can be annotated in a timely manner for reference in both basic research and drug discovery. Based on the concept of pseudo-amino acid composition [K.C. Chou, Proteins: Struct. Funct. Genet. 43 (2001) 246-255; Erratum: Proteins: Struct. Funct. Genet. 44 (2001) 60], which makes it possible to incorporate a considerable amount of sequence-order effects by representing a protein sample as a set of discrete numbers, a novel predictor, the so-called "optimized evidence-theoretic K-nearest neighbor" or "OET-KNN" classifier, was proposed. The self-consistency test, jackknife test, and independent dataset test demonstrated that the new predictor yielded higher success rates than many previous ones in most cases. The new predictor can also be used to improve the prediction quality for, among many other protein attributes, structural class, subcellular localization, enzyme family class, and G-protein coupled receptor type. The OET-KNN classifier will be available as a web-server at http://www.pami.sjtu.edu.cn/kcchou.

3.
Proteins are generally classified into the following 12 subcellular locations: 1) chloroplast, 2) cytoplasm, 3) cytoskeleton, 4) endoplasmic reticulum, 5) extracellular, 6) Golgi apparatus, 7) lysosome, 8) mitochondria, 9) nucleus, 10) peroxisome, 11) plasma membrane, and 12) vacuole. Because the function of a protein is closely correlated with its subcellular location, and because new protein sequences are entering databanks at a rapidly increasing rate, it is vitally important for both basic research and the pharmaceutical industry to establish a high-throughput tool for predicting protein subcellular location. In this paper, a new concept, the so-called "functional domain composition", is introduced. Based on this concept, a protein can be represented as a vector in a high-dimensional space, where each of the clustered functional domains derived from the protein universe serves as a vector base. With this representation, the support vector machine (SVM) algorithm is introduced for predicting protein subcellular location. High success rates are obtained by the self-consistency test, jackknife test, and independent dataset test, respectively. The current approach not only can play an important complementary role to the powerful covariant discriminant algorithm based on the pseudo amino acid composition representation (Chou, K. C. (2001) Proteins Struct. Funct. Genet. 43, 246-255; Correction (2001) Proteins Struct. Funct. Genet. 44, 60), but also may greatly stimulate the development of this area.

4.
A novel approach was developed for predicting the structural classes of proteins based on their sequences. It was assumed that proteins belonging to the same structural class must bear some sort of similar texture on the images generated by the cellular automaton evolving rule [Wolfram, S., 1984. Cellular automata as models of complexity. Nature 311, 419-424]. Based on this, two geometric invariant moment factors derived from the image functions were used as the pseudo amino acid components [Chou, K.C., 2001. Prediction of protein cellular attributes using pseudo amino acid composition. Proteins: Struct., Funct., Genet. 43, 246-255 (Erratum: ibid., 2001, vol. 44, 60)] to formulate the protein samples for statistical prediction. The success rates thus obtained on a previously constructed benchmark dataset are quite promising, implying that the cellular automaton image can help to reveal some inherent and subtle features deeply hidden in a pile of long and complicated amino acid sequences.

5.
Cell membranes are vitally important to the life of a cell. Although the basic structure of biological membranes is provided by the lipid bilayer, membrane proteins perform most of the specific functions. Membrane proteins are putatively classified into five different types. Identification of their types is currently an important topic in bioinformatics and proteomics. In this paper, based on the concept of representing protein samples in terms of their pseudo-amino acid composition (Chou, K.C., 2001. Prediction of protein cellular attributes using pseudo amino acid composition. Proteins: Struct. Funct. Genet. 43, 246-255), the fuzzy K-nearest neighbors (KNN) algorithm has been introduced to predict membrane protein types, and high success rates were observed. It is anticipated that the current approach, which is based on a branch of fuzzy mathematics and represents a new strategy, may play an important complementary role to the existing methods in this area. The novel approach may also have a notable impact on the prediction of other attributes, such as protein structural class, protein subcellular localization, and enzyme family class, among many others.
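The abstract does not spell out the fuzzy KNN rule it applies. A minimal sketch follows, assuming the widely used formulation in which each of the K nearest neighbours votes for its class with weight 1 / d^(2/(m−1)), where d is the distance to the query and m > 1 is the fuzzifier. Training labels are treated as crisp here, and the function name and defaults are illustrative.

```python
import math


def fuzzy_knn(train, query, k=3, m=2.0):
    """Return {label: membership} for `query`; memberships sum to 1.

    train: list of (feature_vector, label) pairs with crisp labels.
    m: fuzzifier (> 1); larger values flatten the distance weighting.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    exponent = 2.0 / (m - 1.0)
    weights = {}
    total = 0.0
    for vec, label in neighbours:
        # Guard against division by zero when the query coincides with a sample.
        d = max(math.dist(vec, query), 1e-12)
        w = 1.0 / d ** exponent
        weights[label] = weights.get(label, 0.0) + w
        total += w
    return {label: w / total for label, w in weights.items()}
```

Unlike a plain majority vote, the result is a graded membership for each class, so a borderline query can be reported as such instead of being forced into one type.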

6.
Membrane proteins play important roles in biochemical processes such as signal transduction and transmembrane transport. Membrane proteins are usually classified into five types [Chou, K.C., Elrod, D.W., 1999. Prediction of membrane protein types and subcellular locations. Proteins: Struct. Funct. Genet. 34, 137-153] or six types [Chou, K.C., Cai, Y.D., 2005. J. Chem. Inf. Modelling 45, 407-413]. Designing in silico methods to identify and classify membrane proteins can help us understand the structure and function of unknown proteins. This paper introduces an integrative approach, IAMPC, to classify membrane proteins based on protein sequences and protein profiles. Its modules extract the amino acid composition of the whole profiles, the amino acid composition of N-terminal and C-terminal profiles, the amino acid composition of profile segments, and the dipeptide composition of the whole profiles. In the computational experiment, the overall accuracy of the proposed approach is comparable with that of the functional-domain-based method. In addition, the performance of the proposed approach is complementary to that of the functional-domain-based method for different membrane protein types.
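The feature types listed above can be illustrated on a raw sequence. The sketch below computes amino acid composition and dipeptide composition directly from residues; IAMPC itself works on protein profiles, which are not reproduced here, so this is a sequence-level analogue and the function names are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """20-dimensional vector of residue frequencies."""
    n = len(seq)
    return [seq.count(a) / n for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """400-dimensional vector of frequencies of adjacent residue pairs."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    total = len(pairs)
    return [pairs.count(dp) / total for dp in DIPEPTIDES]


def terminal_compositions(seq, n_term=10, c_term=10):
    """Composition of the N-terminal and C-terminal segments (segment lengths are illustrative)."""
    return amino_acid_composition(seq[:n_term]), amino_acid_composition(seq[-c_term:])
```

Concatenating such vectors gives one fixed-length feature vector per protein regardless of sequence length, which is what most classifiers require.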

7.
The function of a protein is closely correlated with its subcellular location. With the success of the Human Genome Project and the rapid increase in the number of newly found protein sequences entering data banks, it is highly desirable to develop an automated method for predicting the subcellular location of proteins. Such a predictor would expedite both the determination of function for newly found proteins and the prioritization of genes and proteins identified by genomics efforts as potential molecular targets for drug design. Based on the concept of pseudo amino acid composition originally proposed by K. C. Chou (Proteins: Struct. Funct. Genet. 43: 246–255, 2001), the digital signal processing approach has been introduced to partially incorporate the sequence order effect. One remarkable merit of doing so is that many existing tools in mathematics and engineering can be used straightforwardly to predict protein subcellular location. The results thus obtained are quite encouraging. It is anticipated that digital signal processing may serve as a useful vehicle for many other areas of protein science as well.

8.
Prediction of protein subcellular locations by GO-FunD-PseAA predictor   Total citations: 8 (self-citations: 0; citations by others: 8)
The localization of a protein in a cell is closely correlated with its biological function. With the explosion of protein sequences entering databanks, it is highly desirable to develop an automated method that can quickly identify their subcellular location. This will expedite the annotation process, providing timely and useful information for both basic research and industrial application. In view of this, a powerful predictor has been developed by hybridizing the gene ontology approach [Nat. Genet. 25 (2000) 25], functional domain composition approach [J. Biol. Chem. 277 (2002) 45765], and the pseudo-amino acid composition approach [Proteins Struct. Funct. Genet. 43 (2001) 246; Erratum: ibid. 44 (2001) 60]. As a showcase, the recently constructed dataset [Bioinformatics 19 (2003) 1656] was used for demonstration. The dataset contains 7589 proteins classified into 12 subcellular locations: chloroplast, cytoplasmic, cytoskeleton, endoplasmic reticulum, extracellular, Golgi apparatus, lysosomal, mitochondrial, nuclear, peroxisomal, plasma membrane, and vacuolar. The overall success rate of prediction obtained by the jackknife cross-validation was 92%. This is so far the highest success rate achieved on this dataset by following an objective and rigorous cross-validation procedure.
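The jackknife cross-validation mentioned above is a leave-one-out procedure: each protein is removed in turn, the predictor is built from the rest, and the held-out protein is then predicted. A minimal sketch follows; the `classify` callback and the nearest-neighbour classifier are placeholders chosen for illustration and are not the GO-FunD-PseAA predictor.

```python
def jackknife_success_rate(samples, labels, classify):
    """Leave-one-out success rate.

    classify(train_samples, train_labels, query) must return a predicted label.
    """
    correct = 0
    for i in range(len(samples)):
        # Hold out sample i and train on everything else.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def nearest_neighbour(train_x, train_y, query):
    """Placeholder classifier: label of the closest training sample (squared Euclidean distance)."""
    best = min(
        range(len(train_x)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(train_x[j], query)),
    )
    return train_y[best]
```

Because every sample is tested exactly once against a model that never saw it, the result does not depend on how the data happen to be split, which is why the abstract calls the procedure objective.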

9.
Cell membranes are vitally important to living cells. Although the infrastructure of biological membranes is provided by the lipid bilayer, membrane proteins perform most of the specific functions. Knowledge of membrane protein types often provides crucial hints toward determining the function of an uncharacterized membrane protein. With the avalanche of new protein sequences generated in the post-genomic era, there is strong demand for a high-throughput tool that identifies the type of newly found membrane proteins from their primary sequences, so that they can be annotated in a timely manner for reference in both basic research and drug discovery. To realize this, the key is to establish a powerful identifier that can catch the characteristic sequence patterns of different membrane protein types. However, this is not easy because these patterns are buried in a pile of long and complicated sequences. In this paper, based on the concept of the pseudo-amino acid composition [K.C. Chou, PROTEINS: Struct., Funct., Genet. 43 (2001) 246-255], the low-frequency Fourier spectrum analysis is introduced. The merits of doing so are that the sequence pattern information can be more effectively incorporated into a set of discrete components, and that all the existing prediction algorithms can be used straightforwardly on such a formulation of protein samples. High success rates were observed by the re-substitution test, jackknife test, and independent dataset test, indicating that the low-frequency Fourier spectrum approach may become a very useful tool for membrane protein type prediction. The novel approach also holds high potential for predicting many other attributes of proteins.
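One way to picture a low-frequency Fourier representation is to map each residue to a number and keep only the amplitudes of the first few discrete Fourier coefficients. The sketch below does this with the standard library; the residue-to-number scale, the number of retained components, and the function names are assumptions, since the abstract does not specify them.

```python
import cmath


def encode(seq, scale):
    """Turn a sequence into a numeric signal using a residue property scale (a dict)."""
    return [scale[residue] for residue in seq]


def low_frequency_spectrum(signal, n_components=5):
    """Amplitudes of DFT coefficients k = 0 .. n_components - 1, scaled by signal length."""
    n = len(signal)
    amplitudes = []
    for k in range(n_components):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal)
        )
        amplitudes.append(abs(coeff) / n)
    return amplitudes
```

The output has the same length for every protein, so it can be appended to the 20 amino acid frequencies as a fixed set of extra components.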

10.
We have combined three mutations previously shown to stabilize lambda repressor against thermal denaturation. Two of these mutations are in helix 3, where Gly-46 and Gly-48 have been replaced by alanines [Hecht, M. H., et al. (1986) Proteins: Struct., Funct., Genet. 1, 43-46]. The other mutation, which replaces Tyr-88 with cysteine, allows the protein to form an intersubunit disulfide bond [Sauer, R. T., et al. (1986) Biochemistry 25, 5992-5998]. Calorimetric measurements show that the two alanine substitutions stabilize repressor by about 8 °C, that the disulfide bond stabilizes repressor by about 8 °C, and that the triple mutant is 16 °C more stable than wild-type repressor.

11.
Gao Y, Shao S, Xiao X, Ding Y, Huang Y, Huang Z, Chou KC. Amino Acids, 2005, 28(4): 373-376
Summary. With the avalanche of new protein sequences we are facing in the post-genomic era, it is vitally important to develop an automated method for quickly and accurately determining the subcellular location of uncharacterized proteins. In this article, based on the concept of pseudo amino acid composition (Chou, K.C. Proteins: Structure, Function, and Genetics, 2001, 43: 246–255), three pseudo amino acid components are introduced via the Lyapunov index, Bessel function, and Chebyshev filter. These components deal more efficiently with the chaos and complexity in protein sequences, leading to a higher success rate in predicting protein subcellular location.

12.
With the rapid increase in protein sequence data, it is indispensable to develop automated and reliable predictive methods for protein function annotation. One approach to facilitating protein function prediction is to classify proteins into functional families from their primary sequences. Enzymes are the most important group of proteins, and the accurate prediction of their family and subfamily classes is closely related to their biological functions. In this paper, for the prediction of enzyme subfamily classes, Chou's amphiphilic pseudo-amino acid composition [Chou, K.C., 2005. Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes. Bioinformatics 21, 10-19] has been adopted to represent the protein samples for training the 'one-versus-rest' support vector machine. As a demonstration, the jackknife test was performed on a dataset that contains 2640 oxidoreductase sequences classified into 16 subfamily classes [Chou, K.C., Elrod, D.W., 2003. Prediction of enzyme family classes. J. Proteome Res. 2, 183-190]. The overall accuracy thus obtained was 80.87%. The significant enhancement in accuracy indicates that the current method might play a complementary role to the existing methods.

13.
In the crystal structure of troponin C, the holo C-domain is bound in a head-to-tail fashion to the A-helix of the apo N-domain of a symmetry-related molecule. Using this interaction, we have proposed a model for the calmodulin-peptide complex. We find that the interaction of the C-domain with the A-helix is similar to that observed in the NMR structure of the calmodulin-myosin light chain kinase (MLCK) peptide complex. This similarity in binding has enabled us to make a precise sequence alignment of the target peptides in the calmodulin-binding cleft and to rationalize the amino acid sequence-dependent binding strengths of various peptides. Our model differs from that proposed by Strynadka and James (Proteins Struct. Funct. Genet. 7, 234-248, 1990) in that the peptides are rotated by 100 degrees in the calmodulin binding cleft.

14.
Fast and proper assessment of the structural rigidity of bio-macromolecular complexes, as a measure of structural stability, can be useful in systematic studies to predict molecular function, and can also enable the design of rapid scoring functions to rank automatically generated bio-molecular complexes. Based on the graph theoretical approach of Jacobs et al. [Jacobs DJ, Rader AJ, Kuhn LA, Thorpe MF (2001) Protein flexibility predictions using graph theory. Proteins: Struct Funct Genet 44:150–165] for expressing molecular flexibility, we propose a new scheme to analyze the structural stability of bio-molecular complexes. The analysis identifies, in the interacting subunits, clusters of flappy amino acids (those constituting regions of potential internal motion) that undergo an increase in rigidity at complex formation. Gains in structural rigidity of the interacting subunits upon bio-molecular complex formation can be evaluated by expanding the network of intra-molecular inter-atomic interactions to include inter-molecular inter-atomic interaction terms. We propose two indices for quantifying this change: one local, which expresses localized (at the amino acid level) structural rigidity, and one global, which expresses overall structural stability of the complex. The new system is validated with a series of protein complex structures reported in the Protein Data Bank. Finally, the indices are used as scoring coefficients to rank automatically generated protein complex decoys.

15.
The gene coding for the subunit of a homohexameric alpha-glucosidase II (alpha-D-glucoside glucohydrolase, EC 3.2.1.20) of molecular weight (Mr) 540,000, produced by Bacillus thermoamyloliquefaciens KP1071 (FERM-P8477) growing at 30 to 66 °C, was expressed in Escherichia coli HB101. The resulting homohexameric enzyme had a half-life of 10 min at 80 °C. Its purification and characterization showed that the enzyme was identical with the native one except that the latter lacks 7 N-terminal residues found in the former. The primary sequence of the subunit, with 787 residues and an Mr of 91,070 deduced from the gene, was 24-34% identical to the corresponding sequences of 15 alpha-glucosidases in the glycosyl hydrolase family 31 from 14 eukaryotic origins and the archaeon Sulfolobus solfataricus 98/2. From sequence analysis by the neural network method of Rost and Sander [Rost, B. and Sander, C., Proteins: Struct. Funct. Genet., 19, 55-72 (1994)], we inferred that each subunit of alpha-glucosidase II might be made up of 3 secondary structural regions, i.e., one N-terminal beta region, one central alpha/beta region with two catalytic residues Asp407 and Asp484, and one C-terminal beta region.

16.
Three independent three-dimensional reconstructions of the spinach photosystem II-light-harvesting complex supercomplex were derived from single particle analyses of non-stained, vitrified samples imaged by electron microscopy. Each reconstruction was found to differ significantly in the composition of the lumenal oxygen-evolving complex extrinsic proteins. From difference mapping, aided by electron microscopy of negatively stained selectively washed samples, regions of density were assigned to the PsbO and PsbP/PsbQ proteins. Interpretation of the density assigned to the PsbO protein was explored using computer-aided structural predictions. PsbO is calculated to be mainly a beta-protein (38% beta) composed of two domains within an overall elongated shape (Pazos, F., Heredia, P., Valencia, A., and De Las Rivas, J. (2001) Proteins Struct. Funct. Genet. 45, 372-381). The positioning and fitting of the proposed structural model for the PsbO protein within the three-dimensional map indicated that there is a single copy per reaction center. Moreover, the structural model derived for PsbO, together with difference mapping, indicates that this protein stretches across the surface of the reaction center with its N- and C-terminal domains located toward the CP47 and CP43 side, respectively. This structural assignment is discussed in terms of the recent x-ray-derived cyanobacterial model of PSII (Zouni, A., Witt, H.-T., Kern, J., Fromme, P., Krauss, N., Saenger, W., and Orth, P. (2001) Nature 409, 739-743).

17.
Hemoglobin Ypsilanti (HbY) is a stable tetrameric hemoglobin that binds oxygen with little or no cooperativity and with high affinity [Doyle, M. L., et al. (1992) Proteins: Struct., Funct., Genet. 14, 351-362]. It displays an especially large quaternary enhancement effect. An X-ray crystallographic study [Smith, F. R., et al. (1991) Proteins: Struct., Funct., Genet. 10, 81-91] of the carboxy derivative of this hemoglobin (COHbY) revealed a new quaternary structure that partially resembles the recently described R2 structure [Silva, M. M., et al. (1992) J. Biol. Chem. 267, 17248-17256]. Very little is known about either the solution phase conformations of the liganded and deoxy forms of HbY or the molecular basis for the large quaternary enhancement effect (Doyle et al., 1992). In this study, near-IR absorption, Soret-enhanced Raman, and UV (229 nm) resonance Raman spectroscopies are used to probe the liganded and deoxy derivatives of HbY in solution. Nanosecond time-resolved near-IR absorption measurements are used to expose the relaxation properties of the photoproduct of COHbY. Time-resolved (Soret band) absorption is used to generate the geminate and solvent phase ligand rebinding curves for photodissociated COHbY. The spectroscopic results indicate that COHbY has an R-like conformation with respect to both the proximal heme pocket and the hinge region of the alpha 1 beta 2 interface. The deoxy derivative of HbY has spectroscopic features that are very similar to those observed for species assigned to the deoxy R or half-liganded R conformations of human adult hemoglobin (HbA). The 10 ns to 100 micros relaxation properties of the photoproduct of COHbY are distinctly different from those of HbA in that for HbY, little if any tertiary or quaternary relaxation is observed. The near-absence of relaxation in the HbY photoproduct explains the differences in the geminate and solvent phase CO recombination between HbA and HbY. 
The impact of the conformational and relaxation properties of HbY on the geminate rebinding process forms the basis of a model that accounts for the large quaternary enhancement effect reported for HbY (Doyle et al., 1992). In addition, the spectroscopic data and the X-ray crystallographic results explain the slow relaxation for HbY and the near-absence of cooperative ligand binding for this protein based on the behavior of the penultimate tyrosines.

18.
Small-angle X-ray scattering (SAXS) measurements were used to characterize vitronectin, a circulatory protein found in human plasma that functions in regulating cell adhesion and migration, as well as proteolytic cascades that affect blood coagulation, fibrinolysis, and pericellular proteolysis. SAXS measurements were taken over a 3-fold range of protein concentrations, yielding data that characterize a monodisperse system of particles with an average radius of gyration of 30.3 ± 0.6 Å and a maximum linear dimension of 110 Å. Shape restoration was applied to the data to produce two models of the solution structure of the ligand-free protein. A low-resolution model of the protein was generated that indicates the protein to be roughly peanut-shaped. A better understanding of the domain structure of vitronectin resulted from low-resolution models developed from available high-resolution structures of the domains. These domains include the N-terminal domain that was determined experimentally by NMR [Mayasundari, A., Whittemore, N. A., Serpersu, E. H., and Peterson, C. B. (2004) J. Biol. Chem. 279, 29359-29366] and the docked structure of the central and C-terminal domains that were determined by computational threading [Xu, D., Baburaj, K., Peterson, C. B., and Xu, Y. (2001) Proteins: Struct., Funct., Genet. 44, 312-320]. This model provides an indication of the disposition of the central domain and C-terminal heparin-binding domains of vitronectin with respect to the N-terminal somatomedin B (SMB) domain. This model constructed from the available domain structures, which agrees with the low-resolution model produced from the SAXS data, shows the SMB domain well separated from the central and heparin-binding domains by a disordered linker (residues 54-130). Also, binding sites within the SMB domain are predicted to be well exposed to the surrounding solvent for ease of access to its various ligands.
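The radius of gyration reported above is derived from scattering data, but the same quantity can be computed from a coordinate model when checking it against the measurement. The sketch below uses the geometric definition, the root-mean-square distance of points from their centroid. Giving every point equal weight is a simplifying assumption, and the function name is illustrative.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of 3D points from their centroid (equal weights assumed)."""
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    mean_sq = sum(
        sum((p[i] - centroid[i]) ** 2 for i in range(3)) for p in points
    ) / n
    return math.sqrt(mean_sq)
```

With coordinates in Å, the returned value is also in Å and can be set beside an experimental figure such as the 30.3 ± 0.6 Å quoted in the abstract.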

19.
We report the structures of the crystallographic dimer of porcine pancreatic IB phospholipase A2 (PLA2) with either five sulfate or phosphate anions bound. In each structure, one molecule of a tetrahedral mimic MJ33 [1-hexadecyl-3-(trifluoroethyl)-sn-glycero-2-phosphomethanol] and the five anions are shared between the two subunits of the dimer. The sn-2-phosphate of MJ33 is bound in the active site of one subunit (A), and the alkyl chain extends into the active site slot of the second subunit (B) across the subunit-subunit interface. The two subunits are packed together with a large hydrophobic and desolvated surface buried between them along with the five anions that define a plane. The anions bind by direct contact with two cationic residues (R6 and K10) per subunit and through closer-range H-bonding interactions with other polarizable ligands. These features of the "dimer" suggest that the binding of PLA2 to the anionic groups at the anionic interface may be dominated by coordination through H-bonding with only a partial charge compensation needed. Remarkably, the plane defined by the contact surface is similar to the i-face of the enzyme [Ramirez, F., and Jain, M. K. (1991) Proteins: Struct., Funct., Genet. 9, 229-239], which has been proposed to make contact with the substrate interface for the interfacial catalytic turnover. Additionally, these structures not only offer a view of the active PLA2 complexed to an anionic interface but also provide insight into the environment of the tetrahedral intermediate in the rate-limiting chemical step of the turnover cycle. Taken together, our results offer an atomic-resolution structural view of the i-face interactions of the active form of PLA2 associated with an anionic interface.

20.
Summary. We address the question of how well proteins can be modelled on the basis of NMR data when these data are incorporated into the protein model using distance restraints in a molecular dynamics simulation. We found, using HPr as a model protein, that distance restraining freezes the essential motion of proteins, as defined by Amadei et al. [Amadei, A., Linssen, A.B.M. and Berendsen, H.J.C. (1993) Proteins: Struct. Funct. Genet., 17, 412–425]. We discuss how modelling protocols can be improved in order to solve this problem.
