Similar articles
1.
Prediction of protein structural class from the amino acid sequence   Cited by: 9 (self-citations: 0, other citations: 9)
P Klein  C Delisi 《Biopolymers》1986,25(9):1659-1672
The multidimensional statistical technique of discriminant analysis is used to allocate amino acid sequences to one of four secondary structural classes: high α content, high β content, mixed α and β, and low content of ordered structure. Discrimination is based on four attributes: estimates of the percentages of α and β structures, and regular variations in the hydrophobic values of residues along the sequence, occurring with periods of 2 and 3.6 residues. The reliability of the method, estimated by classifying 138 sequences from the Brookhaven Protein Data Bank, is 80%, with no misallocations between α-rich and β-rich classes. The reliability can be increased to 84% by making no allocation for proteins classified with odds close to 1. Classification using previously developed secondary structural prediction methods is considerably less reliable, the best result being 64% obtained using predictions based on the Delphi method.
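One way to quantify the regular variation in hydrophobicity described in this abstract is the magnitude of the Fourier component of the hydrophobicity profile at a given period. The following is a minimal sketch and not the authors' implementation: the abstract does not say how the periodicity was measured or which hydrophobicity scale was used, so the Kyte-Doolittle scale and the function name `periodicity_strength` are assumptions made for illustration.

```python
import cmath
import math

# Kyte-Doolittle hydropathy values (assumed scale; the abstract names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def periodicity_strength(sequence, period):
    """Magnitude of the Fourier component of the hydrophobicity profile.

    `period` is given in residues (for example 2 or 3.6).
    The profile is mean-centred so a constant offset contributes nothing.
    """
    values = [KYTE_DOOLITTLE[residue] for residue in sequence]
    mean = sum(values) / len(values)
    omega = 2 * math.pi / period  # angular step per residue
    total = sum(
        (value - mean) * cmath.exp(1j * omega * k)
        for k, value in enumerate(values)
    )
    return abs(total) / len(values)


if __name__ == "__main__":
    alternating = "VDVDVDVDVD"  # hydrophobic/polar alternation
    print(periodicity_strength(alternating, 2))
    print(periodicity_strength(alternating, 3.6))
```

A strictly alternating hydrophobic/polar sequence gives a much stronger signal at period 2 than at period 3.6; the two values could then serve as two of the four attributes passed to a discriminant classifier.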

2.
Discriminant analysis assigns objects to one of several classes on the basis of attributes which characterize the objects. The success of classification depends on the selection of discriminatory attributes and on the choice of an assignment rule. In this paper we focus on the latter and discuss ways to obtain nonlinear classification rules through maximum likelihood, canonical components and projection pursuit. We use both linear and nonlinear methods to classify proteins into three secondary structural types: alpha, beta, and mixed alpha and beta or irregular. Using simple attributes, dependent on amino acid properties, we show that the rate of incorrect classification can be decreased by more than 15% when nonlinear methods are used.

3.
The Saccharomyces cerevisiae ribosomal stalk is made of five components, the 32-kDa P0 and four 12-kDa acidic proteins, P1alpha, P1beta, P2alpha, and P2beta. The P0 carboxyl-terminal domain is involved in the interaction with the acidic proteins and resembles their structure. Protein chimeras were constructed in which the last 112 amino acids of P0 were replaced by the sequence of each acidic protein, yielding four fusion proteins, P0-1alpha, P0-1beta, P0-2alpha, and P0-2beta. The chimeras were expressed in P0 conditional null mutant strains in which wild-type P0 is not present. In S. cerevisiae D4567, which is totally deprived of acidic proteins, the four fusion proteins can replace the wild-type P0 with little effect on cell growth. In other genetic backgrounds, the chimeras either reduce or increase cell growth because of their effect on the ribosomal stalk composition. An analysis of the stalk proteins showed that each P0 chimera is able to strongly interact with only one acidic protein. The following associations were found: P0-1alpha.P2beta, P0-1beta.P2alpha, P0-2alpha.P1beta, and P0-2beta.P1alpha. These results indicate that the four acidic proteins do not form dimers in the yeast ribosomal stalk but interact with each other forming two specific associations, P1alpha.P2beta and P1beta.P2alpha, which have different structural and functional roles.

4.
A protein is generally classified into one of the following four structural classes: all alpha, all beta, alpha+beta and alpha/beta. In this paper, based on a weighting of the 20 constituent amino acids, a new method is proposed for predicting the structural class of a protein according to its amino acid composition. The 20 weighting parameters, which reflect the different properties of the 20 constituent amino acids, have been obtained from a training set of proteins through the linear-programming approach. The rate of correct prediction for a training set of proteins by means of the new method was 100%, whereas the highest rate of previous methods was 82.8%. Furthermore, the results showed that the more numerous the training proteins, the more effective the new method.

5.
The bulk hydrophobic character for the 20 natural amino acid residues has been obtained from a database of 60 protein structures, grouped in the four structural classes alpha alpha, beta beta, alpha + beta and alpha/beta. The hydrophobicity coefficients thus obtained are compared with Ponnuswamy's original values using scales normalized to average = 0.0 and standard deviation = 1.0. Even though most of the amino acid residues do not change their hydropathic character in the different structural classes, their behaviour suggests that averaging methods should consider only proteins of the same structural class and that this information should be included in the secondary structure methods.
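The normalization to average = 0.0 and standard deviation = 1.0 mentioned in this abstract is a standard z-score transform, which puts two hydrophobicity scales on a common footing before they are compared. A minimal sketch follows; the function name `normalize_scale` and the toy values are hypothetical, and the use of the population (rather than sample) standard deviation is an assumption, since the abstract does not specify it.

```python
import statistics


def normalize_scale(scale):
    """Rescale a {residue: value} mapping to mean 0.0 and std. dev. 1.0."""
    values = list(scale.values())
    mean = statistics.fmean(values)
    # Population standard deviation (assumption; not stated in the abstract).
    sd = statistics.pstdev(values)
    return {residue: (value - mean) / sd for residue, value in scale.items()}


if __name__ == "__main__":
    toy_scale = {"A": 1.0, "G": 2.0, "L": 3.0}  # illustrative values only
    print(normalize_scale(toy_scale))
```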

6.
In this study we classified regions of random coil into four types: coil between alpha helix and beta strand, coil between beta strand and alpha helix, coil between two alpha helices and coil between two beta strands. This classification may be considered as natural. We used 610 3D structures of proteins collected from the Protein Data Bank from bacteria with low, average and high genomic GC-content. Relatively short regions of coil are not random: certain amino acid residues are more or less frequent in each of the types of coil. Namely, hydrophobic amino acids with branched side chains (Ile, Val and Leu) are rare in coil between two beta strands, unlike some acrophilic amino acids (Asp, Asn and Gly). In contrast, coil between two alpha helices is enriched by Leu. Regions of coil between alpha helix and beta strand are enriched by positively charged amino acids (Arg and Lys), while the usage of residues with side chains possessing hydroxyl group (Ser and Thr) is low in them, in contrast to the regions of coil between beta strand and alpha helix. Regions of coil between beta strand and alpha helix are significantly enriched by Cys residues. The response to the symmetric mutational pressure (AT-pressure or GC-pressure) is also quite different for four types of coil. The most conserved regions of coil are “connecting bridges” between beta strand and alpha helix, since their amino acid content shows less strong dependence on GC-content of genes than amino acid contents of other three types of coil. Possible causes and consequences of the described differences in amino acid content distribution between different types of random coil have been discussed.

7.
Wang ZX  Yuan Z 《Proteins》2000,38(2):165-175
Proteins of known structures are usually classified into four structural classes: all-alpha, all-beta, alpha+beta, and alpha/beta type of proteins. A number of methods for predicting the structural class of a protein based on its amino acid composition have been developed during the past few years. Recently, a component-coupled method was developed for predicting protein structural class according to amino acid composition. This method is based on the least Mahalanobis distance principle, and yields much better predicted results in comparison with the previous methods. However, the success rates reported for structural class prediction by different investigators are contradictory. The highest reported accuracies by this method are near 100%, but the lowest one is only about 60%. The goal of this study is to resolve this paradox and to determine the possible upper limit of prediction rate for structural classes. In this paper, based on the normality assumption and the Bayes decision rule for minimum error, a new method is proposed for predicting the structural class of a protein according to its amino acid composition. The detailed theoretical analysis indicates that if the four protein folding classes are governed by the normal distributions, the present method will yield the optimum predictive result in a statistical sense. A non-redundant data set of 1,189 protein domains is used to evaluate the performance of the new method. Our results demonstrate that 60% correctness is the upper limit for a 4-type class prediction from amino acid composition alone for an unknown query protein. The apparent relatively high accuracy level (more than 90%) attained in the previous studies was due to the preselection of test sets, which may not be adequately representative of all unrelated proteins.
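The Bayes decision rule under a normality assumption, as named in this abstract, assigns a query to the class with the highest posterior, that is, the largest sum of log prior and Gaussian log-likelihood. The sketch below is a simplified illustration and not the authors' method: it assumes a diagonal covariance matrix (independent components) to stay within the standard library, whereas a full treatment would use the complete covariance matrix, as in the Mahalanobis distance. The function names, the two-component feature vectors, and all numbers are hypothetical.

```python
import math


def gaussian_log_score(x, mean, variance, prior):
    """Log prior plus log-likelihood under a diagonal-covariance Gaussian."""
    score = math.log(prior)
    for xi, mi, vi in zip(x, mean, variance):
        score -= 0.5 * math.log(2 * math.pi * vi) + (xi - mi) ** 2 / (2 * vi)
    return score


def classify(x, class_models):
    """Return the class name with the highest posterior score.

    `class_models` maps a class name to a (mean, variance, prior) tuple.
    """
    return max(
        class_models,
        key=lambda name: gaussian_log_score(x, *class_models[name]),
    )


if __name__ == "__main__":
    # Toy two-component "compositions"; real vectors would have 20 components.
    models = {
        "all-alpha": ([0.12, 0.04], [0.001, 0.001], 0.5),
        "all-beta": ([0.05, 0.09], [0.001, 0.001], 0.5),
    }
    print(classify([0.11, 0.05], models))
```

With equal priors and equal variances this rule reduces to picking the nearest class mean; unequal variances or priors shift the decision boundary.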

8.
A protein is usually classified into one of the following four structural classes: all alpha, all beta, (alpha + beta) and alpha/beta. In this paper, based on the maximum correlation-coefficient principle, a new formulation is proposed for predicting the structural class of a protein according to its amino acid composition. Calculations have been made for a development set of proteins from which the amino acid compositions for the standard structural classes were derived, and an independent set of proteins which are outside the development set. The former can test the self-consistency of a method and the latter can test its extrapolating effectiveness. In both cases, the results showed that the new method gave a considerably higher rate of correct prediction than any of the previous methods, implying that a significant improvement has been achieved by implementing the maximum-correlation-coefficient principle in the new method.
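A maximum correlation-coefficient rule can be read as: compare the query composition with the standard composition of each class and pick the class with the highest correlation. The sketch below assumes the Pearson correlation coefficient as the similarity measure, which the abstract does not specify, so it should not be taken as the authors' exact formulation. The function names, the four-component vectors, and the class standards are hypothetical toy values.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    covariance = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    spread_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    spread_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return covariance / (spread_u * spread_v)


def predict_class(query, class_standards):
    """Return the class whose standard composition correlates best with the query."""
    return max(
        class_standards,
        key=lambda name: pearson(query, class_standards[name]),
    )


if __name__ == "__main__":
    # Toy four-component vectors; real compositions have 20 components.
    standards = {
        "all-alpha": [0.4, 0.1, 0.2, 0.3],
        "all-beta": [0.1, 0.4, 0.3, 0.2],
    }
    print(predict_class([0.35, 0.15, 0.2, 0.3], standards))
```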

9.
Deciphering the native conformation of proteins from their amino acid sequences is one of the most challenging problems in molecular biology. Information on the secondary structure of a protein can be helpful in understanding its native folded state. In our earlier work on molecular chaperones, we have analyzed the hydrophobic and charged patches, short-, medium- and long-range contacts and residue distributions along the sequence. In this article, we have made an attempt to predict the structural class of globular and chaperone proteins based on the information obtained from residue distributions. This method predicts the structural class with an accuracy of 93 and 96%, respectively, for the four- and three-state models in a training set of 120 globular proteins, and 90 and 96%, respectively, for a test set of 80 proteins. We have used this information and methodology to predict the structural classes of chaperones. Interestingly most of the chaperone proteins are predicted under alpha/beta or mixed folding type.

10.
Proteins are generally classified into four structural classes: all-alpha proteins, all-beta proteins, alpha + beta proteins, and alpha/beta proteins. In this article, a protein is expressed as a vector in a 20-dimensional space, in which its 20 components are defined by the composition of its 20 amino acids. Based on this, a new method, the so-called maximum component coefficient method, is proposed for predicting the structural class of a protein according to its amino acid composition. In comparison with the existing methods, the new method yields a higher general accuracy of prediction. Especially for the all-alpha proteins, the rate of correct prediction obtained by the new method is much higher than that by any of the existing methods. For instance, for the 19 all-alpha proteins investigated previously by P.Y. Chou, the rate of correct prediction by means of his method was 84.2%, but the correct rate when predicted with the new method would be 100%! Furthermore, the new method is characterized by an explicable physical picture. This is reflected by the process in which the vector representing a protein to be predicted is decomposed into four component vectors, each of which corresponds to one of the norms of the four protein structural classes.
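The 20-dimensional representation described in this abstract can be sketched in a few lines: each component is the fraction of the sequence made up by one amino acid. This illustrates only the vector representation, not the maximum component coefficient method itself; the function name `composition_vector` and the alphabetical one-letter ordering are assumptions made for illustration.

```python
# One-letter codes of the 20 standard amino acids, in alphabetical order
# (the ordering is an arbitrary choice for this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(sequence):
    """Return the 20 amino acid fractions of `sequence` as a list.

    Characters outside the 20 standard residues are ignored.
    """
    sequence = sequence.upper()
    counts = [sequence.count(residue) for residue in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [count / total for count in counts]


if __name__ == "__main__":
    vector = composition_vector("AAGG")
    print(dict(zip(AMINO_ACIDS, vector)))
```

Because the components are fractions, they always sum to 1, so only 19 of the 20 components are independent.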

11.
Bacteriophage T4 alpha- and beta-glucosyltransferases link glucosyl units to the 5-HMdC residues of its DNA. The monoglucosyl group in alpha-linkage predominates over the one in beta linkage. Having recently reported on the nucleotide sequence of gene alpha gt (1) we now determined the nucleotide sequence of gene beta gt. The genes were each cloned on a high expression vector under the control of the lambda pL promoter. After thermo-induction the proteins were isolated and purified to homogeneity. To verify that the translational starting sites and the proposed reading frames are effective in vivo the sequence of the first 31 amino acid residues from gp alpha gt and the first 30 amino acid residues from gp beta gt were determined by Edman degradation. The primary structures of the two proteins seem to have only limited structural similarities. The results are discussed comparing secondary structure predictions and homologies with other proteins from the protein sequence database of the Protein Identification Resource.

12.
The 60S ribosomal subunits from Saccharomyces cerevisiae contain a set of four acidic proteins named YP1alpha, YP1beta, YP2alpha, and YP2beta. The genes for each were PCR amplified from a yeast cDNA library, sequenced, and expressed in Escherichia coli cells using two expression systems. The first system, pLM1, was used for YP1beta, YP2alpha, and YP2beta. The second one, pT7-7, was used for YP1alpha. Expression in both cases was under the control of a strong inducible T7 promoter. The amount of induced recombinant proteins in the host cells was around 10 to 20% of the total soluble bacterial proteins. A new protocol for purification of all four recombinant proteins was established. The preliminary steps of purification were done by ammonium sulfate precipitation (YP1alpha, YP1beta) or NH4Cl/ethanol extraction (YP2alpha, YP2beta). The recombinant proteins were then purified to apparent homogeneity by only two steps of classical chromatographies, ion exchange (DEAE-cellulose) and gel filtration (Sephacryl S-200). Isoelectrofocusing analysis of YP2alpha and YP2beta showed the pIs of the recombinant proteins are the same as that of the native yeast ribosomal P2 proteins. The pI of YP1alpha is changed due to the addition of five amino acids attached to the N-terminus of recombinant polypeptide from the expression vector. YP1beta was obtained as a truncated form of polypeptide, similar to its ribosomal counterpart, YP1beta'. This was proved by isoelectrofocusing gel analysis.

13.
Tobi D 《Proteins》2012,80(4):1167-1176
A novel methodology for comparison of protein dynamics is presented. Protein dynamics is calculated using the Gaussian network model and the modes of motion are globally aligned using the dynamic programming algorithm of Needleman and Wunsch, commonly used for sequence alignment. The alignment is fast and can be used to analyze large sets of proteins. The methodology is applied to the four major classes of the SCOP database: "all alpha proteins," "all beta proteins," "alpha and beta proteins," and "alpha/beta proteins". We show that different domains may have similar global dynamics. In addition, we report that the dynamics of "all alpha proteins" domains are less specific to structural variations within a given fold or superfamily compared with the other classes. We report that domain pairs with the most similar and the least similar global dynamics tend to be of similar length. The significance of the methodology is that it suggests a new and efficient way of mapping between the global structural features of protein families/subfamilies and their encoded dynamics.
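The Needleman-Wunsch algorithm named in this abstract fills a dynamic programming table in which each cell holds the best global alignment score of two prefixes. The following is a minimal, score-only sketch with a linear gap penalty. It is generic rather than a reproduction of the paper's procedure: the abstract does not say how similarity between two modes of motion is scored, so the `similarity` argument is a caller-supplied, hypothetical function, and the gap penalty value is an assumption.

```python
def needleman_wunsch_score(a, b, similarity, gap=-1.0):
    """Best global alignment score of sequences `a` and `b`.

    `similarity(x, y)` scores aligning element x with element y;
    `gap` is the (negative) score of aligning an element with a gap.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0.0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per element.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + similarity(a[i - 1], b[j - 1]),  # align pair
                score[i - 1][j] + gap,  # gap in b
                score[i][j - 1] + gap,  # gap in a
            )
    return score[-1][-1]


if __name__ == "__main__":
    def match_score(x, y):
        return 1.0 if x == y else -1.0

    print(needleman_wunsch_score("GATTACA", "GATTACA", match_score))
```

Because `similarity` is a parameter, the same routine can align letters, numbers, or any other objects for which a pairwise score can be defined.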

14.
Standard conformations of a polypeptide chain in irregular protein regions   Cited by: 1 (self-citations: 0, other citations: 1)
A detailed stereochemical analysis of known protein structures has been made which shows that: (1) irregular regions of proteins consist of a limited number of standard structures formed by three, four or more residues; (2) an amino acid residue of a protein can adopt one of the six sterically allowed conformations designated here as alpha, alpha L, beta, gamma, delta, and epsilon. It is shown that there are two allowed conformations of a polypeptide chain at the N-end of an alpha-helix, beta alpha n- and beta gamma alpha n-conformations, where n is a number of residues in the alpha-helix. At the C-end of the alpha-helix there are two conformations as well, alpha n gamma beta- and alpha n gamma alpha L beta-ones. Two beta-strands in a beta-hairpin can be joined, for example, by standard structures with beta beta alpha L beta-, beta alpha gamma alpha L beta-, beta alpha alpha gamma alpha L beta-conformations which are referred to as turns. In the regions where a polypeptide chain passes from one layer to another there are standard structures with beta gamma beta-, beta alpha beta beta-, beta alpha gamma beta-conformations etc., referred to as cross-overs. A structure of any protein irregular region can be represented as a combination of these and other standard turns and cross-overs considered in the paper. The major part of the turns and cross-overs has residues in alpha L- or epsilon-conformations which must be glycine or other residues with small or flexible side chains. Massive hydrophobic residues must not occupy the first beta-positions of the most standard structures. The results obtained can be successfully applied for prediction of the location of the turns and cross-overs in proteins from their amino acid sequences and for interpretation of electron density maps.

15.
Short-range and long-range contacts are important in forming protein structure. Proteins can be grouped into four different structural classes according to the content and topology of alpha-helices and beta-strands, namely all-alpha, all-beta, alpha/beta and alpha+beta proteins. However, the statistical properties of these classes differ considerably. In this paper, we discuss protein structure in terms of the relative number of long-range (short-range) contacts for each residue. We find that the percentage of residues having a large number of long-range contacts is small in the all-alpha class of proteins, and large in the all-beta class. However, the percentage is almost the same in the alpha/beta and alpha+beta classes. We calculate the percentage of residues having a number of long-range contacts greater than or equal to N(L) (≥ N(L)), with N(L) = 5 and 7, for 428 proteins. The average percentage is 13.3%, 54.8%, 41.4% and 37.0% for all-alpha, all-beta, alpha/beta and alpha+beta classes of proteins with N(L) = 5, respectively. As N(L) increases, the percentage decreases, especially for the all-alpha class. Meanwhile, the percentage of residues having a number of short-range contacts greater than or equal to N(S) (≥ N(S)) is large for the all-alpha class of proteins, and small for the all-beta class, especially for large N(S). We also investigate the ability of amino acid residues to form a large number of long-range and short-range contacts. Cys, Val, Ile, Tyr, Trp and Phe can form a large number of long-range contacts easily, whereas Glu, Lys, Asp, Gln, Arg and Asn do so only with difficulty. We also discuss the relative ability of the 20 amino acid residues to form short-range contacts. A comparison between the Fauchere-Pliska hydrophobicity scale and the percentage of residues having a large number of long-range contacts is also made. This investigation can provide some insights into protein structure.
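The per-residue contact statistic described in this abstract can be sketched as follows: count, for each residue, the partners that are close in space but far apart along the chain, then report the percentage of residues whose count reaches a threshold such as N(L) = 5. This is an illustrative sketch only. The abstract does not define a contact, so the 8 Å distance cutoff, the minimum sequence separation of 5 residues, the use of one coordinate per residue, and both function names are assumptions.

```python
import math


def long_range_contact_counts(coords, cutoff=8.0, min_separation=5):
    """Number of long-range contacts per residue.

    `coords` holds one (x, y, z) point per residue, in chain order.
    Two residues are in long-range contact when they are at least
    `min_separation` positions apart and within `cutoff` of each other.
    """
    n = len(coords)
    counts = [0] * n
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                counts[i] += 1
                counts[j] += 1
    return counts


def percent_with_at_least(counts, threshold):
    """Percentage of residues with at least `threshold` contacts."""
    return 100.0 * sum(c >= threshold for c in counts) / len(counts)


if __name__ == "__main__":
    # Toy chain: six residues on a line, plus a seventh folded back near the start.
    chain = [(3.8 * k, 0.0, 0.0) for k in range(6)] + [(0.0, 5.0, 0.0)]
    counts = long_range_contact_counts(chain)
    print(counts, percent_with_at_least(counts, 1))
```

Lowering `min_separation` and capping the separation instead would give the analogous short-range statistic.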

16.
Four isozymes of bile salt hydrolase (BSH) have been purified from the cytosol of cells of Lactobacillus sp. strain 100-100. The four proteins were designated BSH A, B, C, and D. They eluted from anion-exchange high-pressure liquid chromatography columns at 0.15, 0.18, 0.21, and 0.25 M NaCl, respectively. They are catalytically similar, except that the Vmax of BSH D is about 10-fold lower than those of the other three isozymes. All four proteins consist of one or two polypeptides. The peptides have molecular weights of 42,000 and 38,000 and are designated alpha and beta, respectively. The approximate native molecular weights of BSH A, B, C, and D are 115,000, 105,000, 95,000, and 80,000, respectively. The native proteins are probably trimers; the four isozymes are the array of possible subunit combinations alpha 3, alpha 2 beta 1, alpha 1 beta 2, and beta 3 for A, B, C, and D, respectively. The two subunits are antigenically distinct. Polyclonal antibodies raised against BSH A (all alpha peptide) react in Western blots (immunoblots) only with proteins containing the alpha peptide; such antibodies raised against BSH D (all beta peptide) react only with proteins containing the beta peptide. The amino acid compositions of the two peptides differ. This is the first report of a bacterium that makes four BSH isozymes.

17.
The cytoplasmic, NAD-linked hydrogenase of the Gram-positive hydrogen-oxidizing bacterium Nocardia opaca 1b was compared with the analogous enzyme isolated from the Gram-negative bacterium Alcaligenes eutrophus H16. The hydrogenase of N. opaca 1b was purified by a new procedure applying chromatography on phenyl-Sepharose and DEAE-Sephacel with two columns in series. A homogeneous enzyme preparation with a specific activity of 74 µmol H2 oxidized·min⁻¹·mg protein⁻¹ and a yield of 32% was isolated. The A. eutrophus enzyme was purified as previously published. Both enzymes are tetrameric proteins composed of four non-identical subunits (alpha, beta, gamma, delta). The four subunits of both of these enzymes were separated and isolated as single polypeptides by preparative polyacrylamide gel electrophoresis in the presence of sodium dodecyl sulfate. Immunological comparison of the four subunits of the Nocardia hydrogenase with those of the Alcaligenes enzyme showed that the alpha, beta, gamma, and delta subunits of one organism were serologically related to the analogous subunits of the other organism. Among themselves, the four subunits do not have any serological relationship. The eight individual polypeptides were also compared with respect to the NH2-terminal amino acid sequences determined by automated Edman degradation and to the amino acid compositions. Strong sequence similarities exist between the analogous subunits isolated from the two bacteria. Within the established N-terminal sequences the similarities between both alpha, beta, gamma and delta subunits amount to 63%, 79%, 80% and 65%, respectively. No similarities exist between the different, non-analogous subunits alpha, beta, gamma and delta.

18.
19.
20.
Parisi G  Echave J 《Gene》2005,345(1):45-53
The Structurally Constrained Protein Evolution (SCPE) model simulates protein evolution by introducing random mutations into the evolving sequences and selecting them against too much structural perturbation. Given a single protein structure, the SCPE model can be used to obtain a whole set of site-dependent amino acid substitution matrices. The set of SCPE substitution matrices for a given protein family can be seen as an independent-sites model of evolution for that family. Thus, these matrices can be compared with other substitution-matrix-based models of evolution. So far, SCPE has been tested only on left-handed parallel beta helix (LbetaH) proteins. Here, we address the question of generality by assessing the SCPE model on representatives of the four main classes of folds: alpha, beta, alpha+beta, and alpha/beta. We compare with other models using the likelihood ratio test with parametric bootstrapping. We show that SCPE performs better than the popular JTT model for all cases considered. Furthermore, by considering the relative contributions of mutation and selection, we found that the key to the success of the SCPE model is the selection step.
