Similar articles
Found 20 similar articles (search time: 31 ms)
1.
A protein is generally classified into one of the following four structural classes: all alpha, all beta, alpha+beta and alpha/beta. In this paper, based on a weighting of the 20 constituent amino acids, a new method is proposed for predicting the structural class of a protein according to its amino acid composition. The 20 weighting parameters, which reflect the different properties of the 20 constituent amino acids, have been obtained from a training set of proteins through the linear-programming approach. The rate of correct prediction for a training set of proteins by means of the new method was 100%, whereas the highest rate of previous methods was 82.8%. Furthermore, the results showed that the more numerous the training proteins, the more effective the new method.

2.
A protein is usually classified into one of the following four structural classes: all alpha, all beta, alpha + beta and alpha/beta. In this paper, based on the maximum-correlation-coefficient principle, a new formulation is proposed for predicting the structural class of a protein according to its amino acid composition. Calculations have been made for a development set of proteins, from which the amino acid compositions for the standard structural classes were derived, and for an independent set of proteins that lie outside the development set. The former tests the self-consistency of a method and the latter tests its extrapolating effectiveness. In both cases, the results showed that the new method gave a considerably higher rate of correct prediction than any of the previous methods, implying that a significant improvement has been achieved by implementing the maximum-correlation-coefficient principle in the new method.
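
The abstract does not give the formulation itself, so the following is only a minimal sketch of one plausible reading: a query protein is assigned to the class whose standard composition correlates most strongly with its own composition. The use of the Pearson coefficient, the function names `pearson` and `predict_class`, and the short 4-component toy vectors (real composition vectors would have 20 components) are all assumptions made for illustration, not the authors' published method.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (norm_x * norm_y)

def predict_class(query, class_norms):
    """Return the class whose standard composition correlates best with the query."""
    return max(class_norms, key=lambda name: pearson(query, class_norms[name]))

# Hypothetical toy data: two classes, 4 components instead of 20.
norms = {
    "all-alpha": [0.40, 0.30, 0.20, 0.10],
    "all-beta": [0.10, 0.20, 0.30, 0.40],
}
query = [0.35, 0.30, 0.20, 0.15]
predicted = predict_class(query, norms)
```

In this toy case the query correlates positively with the all-alpha vector and negatively with the all-beta vector, so `predicted` is `"all-alpha"`.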

3.
Prediction of protein structural class by discriminant analysis (total citations: 7; self-citations: 0; citations by others: 7)
The structural class of a protein (alpha, beta, mixed [alpha/beta or alpha + beta], or irregular) can be predicted from the amino acid sequence by discriminant analysis. Discrimination is based on the distributions, within the classes, of vectors of attributes characterizing the sequences. In this paper, two sets of attributes and two methods of estimating their distributions are compared using more than 100 proteins from the Protein Data Bank. The best results were obtained when canonical variates of the frequencies of occurrence of the 20 amino acids and non-parametric estimates of their distributions were used. Three variates are sufficient to allocate proteins to one of four classes with 83% reliability (estimated by cross-validation), and four variates allowed allocation to one of five classes with 78% reliability.

4.
Here we present a systematic analysis of accessible surface areas and hydrogen bonds of 2554 globular proteins from four structural classes (all-α, all-β, α/β and α+β proteins). The analysis aims to determine in which structural class the accessible surface area increases with increasing protein molecular mass more rapidly than in the other classes, and which structural peculiarities are responsible for this effect. The beta structural class of proteins was found to be the leader, with the following possible explanations. First, in beta structural proteins, the fraction of residues not included in regular secondary structure is the largest; second, the accessible surface area of packed elements of the beta-structure increases more rapidly with increasing molecular mass than that of the alpha-structure. Moreover, in the beta structure, the probability of formation of backbone hydrogen bonds is higher than that in the alpha helix for all residues of α+β proteins (the average probability is 0.73±0.01 for the beta-structure and 0.60±0.01 for the alpha-structure without proline) and α/β proteins, except for asparagine, aspartic acid, glycine, threonine, and serine (0.70±0.01 for the beta-structure and 0.60±0.01 for the alpha-structure without the proline residue). There is a linear relationship between the number of hydrogen bonds and the number of amino acid residues in the protein (number of hydrogen bonds = 0.678 × number of residues − 3.350).
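
As a small worked example, the linear relationship quoted at the end of this abstract can be evaluated directly. The function name `estimated_hydrogen_bonds` and the 100-residue input are illustrative assumptions; only the two coefficients come from the abstract.

```python
def estimated_hydrogen_bonds(n_residues):
    """Evaluate the fit quoted in the abstract: 0.678 * residues - 3.350."""
    if n_residues <= 0:
        raise ValueError("number of residues must be positive")
    return 0.678 * n_residues - 3.350

# Hypothetical 100-residue protein: 0.678 * 100 - 3.350 = 64.45 hydrogen bonds.
estimate = estimated_hydrogen_bonds(100)
```

Because this is a fit over the studied set of globular proteins, the value is an average expectation rather than a prediction for any particular structure.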

5.
The amino acid sequences of pike eel gonadotropin alpha and beta subunits have been determined by standard sequencing analytical methods. The alpha subunit is composed of 93 amino acid residues while the beta subunit comprises 113 amino acid residues. All the invariant half-cystine residues are in the same positions as those found in other gonadotropins. It is noteworthy that the first putative glycosylation site (Asn56) found in the alpha subunit of other gonadotropins was replaced by Asp56 in the alpha subunit of pike eel gonadotropin. Similarity analyses indicate that both subunits are structurally more similar to other known fish gonadotropin subunits than to those of the mammalian gonadotropins.

6.
Wang ZX, Yuan Z. Proteins 2000, 38(2): 165-175
Proteins of known structure are usually classified into four structural classes: all-alpha, all-beta, alpha+beta, and alpha/beta proteins. A number of methods for predicting the structural class of a protein based on its amino acid composition have been developed during the past few years. Recently, a component-coupled method was developed for predicting protein structural class according to amino acid composition. This method is based on the least Mahalanobis distance principle, and yields much better predictions than the previous methods. However, the success rates reported for structural class prediction by different investigators are contradictory. The highest reported accuracies by this method are near 100%, but the lowest is only about 60%. The goal of this study is to resolve this paradox and to determine the possible upper limit of the prediction rate for structural classes. In this paper, based on the normality assumption and the Bayes decision rule for minimum error, a new method is proposed for predicting the structural class of a protein according to its amino acid composition. The detailed theoretical analysis indicates that if the four protein folding classes are governed by normal distributions, the present method will yield the optimum predictive result in a statistical sense. A non-redundant data set of 1,189 protein domains is used to evaluate the performance of the new method. Our results demonstrate that 60% correctness is the upper limit for a 4-type class prediction from amino acid composition alone for an unknown query protein. The apparently high accuracy (more than 90%) attained in previous studies was due to the preselection of test sets, which may not be adequately representative of all unrelated proteins.

7.
A lactococcal bacteriocin, termed lactococcin G, was purified to homogeneity by a simple four-step purification procedure that includes ammonium sulfate precipitation, binding to a cation exchanger and octyl-Sepharose CL-4B, and reverse-phase chromatography. The final yield was about 20%, and nearly a 7,000-fold increase in the specific activity was obtained. The bacteriocin activity was associated with three peptides, termed alpha 1, alpha 2, and beta, which were separated by reverse-phase chromatography. Judging from their amino acid sequences, alpha 1 and alpha 2 were the same gene product. Differences in their configurations presumably resulted in alpha 2 having a slightly lower affinity for the reverse-phase column than alpha 1 and a reduced bacteriocin activity when combined with beta. Bacteriocin activity required the complementary action of both the alpha and the beta peptides. When neither alpha 1 nor beta was in excess, about 0.3 nM alpha 1 and 0.04 nM beta induced 50% growth inhibition, suggesting that they might interact in a 7:1 or 8:1 ratio. As judged by the amino acid sequence, alpha 1 has an isoelectric point of 10.9, an extinction coefficient of 1.3 × 10^4 M^-1 cm^-1, and a molecular weight of 4,346 (39 amino acid residues long). Similarly, beta has an isoelectric point of 10.4, an extinction coefficient of 2.4 × 10^4 M^-1 cm^-1, and a molecular weight of 4,110 (35 amino acid residues long). Molecular weights of 4,376 and 4,109 for alpha 1 and beta, respectively, were obtained by mass spectrometry. The N-terminal halves of both the alpha and beta peptides may form amphiphilic alpha-helices, suggesting that the peptides are pore-forming toxins that create cell membrane channels through a "barrel-stave" mechanism. The C-terminal halves of both peptides consist largely of polar amino acids.
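
The suggested 7:1 or 8:1 interaction ratio appears to follow from the two concentrations quoted for 50% growth inhibition. The short calculation below makes that step explicit; the variable names are illustrative assumptions, and reading the concentration ratio as a stoichiometry is the abstract's own tentative suggestion.

```python
# Concentrations at 50% growth inhibition, as quoted in the abstract (nM).
alpha1_nM = 0.3
beta_nM = 0.04

# Molar ratio of alpha 1 to beta: 0.3 / 0.04 = 7.5, i.e. between 7:1 and 8:1.
molar_ratio = alpha1_nM / beta_nM
```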

8.
Proteins are generally classified into four structural classes: all-alpha proteins, all-beta proteins, alpha + beta proteins, and alpha/beta proteins. In this article, a protein is expressed as a vector in 20-dimensional space, whose 20 components are defined by the composition of its 20 amino acids. Based on this, a new method, the so-called maximum component coefficient method, is proposed for predicting the structural class of a protein according to its amino acid composition. In comparison with the existing methods, the new method yields a higher overall accuracy of prediction. For the all-alpha proteins in particular, the rate of correct prediction obtained by the new method is much higher than that of any of the existing methods. For instance, for the 19 all-alpha proteins investigated previously by P.Y. Chou, the rate of correct prediction by means of his method was 84.2%, whereas the rate with the new method would be 100%. Furthermore, the new method has a clear physical interpretation: the vector representing the protein to be predicted is decomposed into four component vectors, each of which corresponds to one of the norms of the four protein structural classes.
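
To make the 20-dimensional representation concrete, here is a minimal sketch of how an amino acid composition vector might be computed from a one-letter sequence. The function name `composition_vector`, the alphabetical ordering of the residues, and the toy sequence are assumptions for illustration; the abstract does not specify an ordering, and the decomposition into class norms is not reproduced here.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition_vector(sequence):
    """Return the fractional occurrence of each standard amino acid."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]

# Hypothetical toy sequence: 2 Ala, 1 Gly, 1 Lys.
vec = composition_vector("AAGK")
```

Non-standard characters are silently skipped in this sketch, so the components always sum to 1 over the residues that were counted.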

9.
Amino acid sequence of porcine heart fumarase (total citations: 3; self-citations: 0; citations by others: 3)
The complete amino acid sequence of porcine heart fumarase (EC 4.2.1.2) has been determined from peptides produced by cyanogen bromide, endoproteinase Arg-C, S. aureus V8 protease, and trypsin. The enzyme is a tetramer of identical subunits, each with Mr = 50,015 and composed of 466 amino acid residues. Porcine heart fumarase displays 96% identity to human liver fumarase. Prediction of the secondary structural elements of porcine fumarase indicates that the enzyme contains a large amount of alpha helix with very little beta structure.

10.
Ofran Y, Margalit H. Proteins 2006, 64(1): 275-279
It is well established that there is a relationship between the amino acid composition of a protein and its structural class (i.e., alpha, beta, alpha + beta, or alpha/beta). Several studies have even shown the power of amino acid composition in predicting the secondary structure class of a protein. Herein, we show that significant similarity in amino acid composition exists not only between proteins of the same class, but even between proteins of the same fold. To test conjectural explanations for this phenomenon, we analyzed a set of structurally similar proteins that are dissimilar in sequence. Based on this analysis, we suggest that specific residues that are involved in intramolecular interactions may account for this surprising relationship between composition and structure.

11.
Short-range and long-range contacts are important in forming protein structure. Proteins can be grouped into four structural classes according to the content and topology of alpha-helices and beta-strands: all-alpha, all-beta, alpha/beta and alpha+beta proteins. However, the statistical properties of these classes differ considerably. In this paper, we discuss protein structure in terms of the relative number of long-range (short-range) contacts for each residue. We find that the percentage of residues having a large number of long-range contacts is small in the all-alpha class of proteins and large in the all-beta class. However, this percentage is almost the same in the alpha/beta and alpha+beta classes. We calculate the percentage of residues having a number of long-range contacts greater than or equal to N(L) (≥ N(L)), with N(L) = 5 and 7, for 428 proteins. The average percentage is 13.3%, 54.8%, 41.4% and 37.0% for the all-alpha, all-beta, alpha/beta and alpha+beta classes of proteins with N(L) = 5, respectively. As N(L) increases, the percentage decreases, especially for the all-alpha class. Meanwhile, the percentage of residues having a number of short-range contacts greater than or equal to N(S) (≥ N(S)) is large for the all-alpha class of proteins and small for the all-beta class, especially for large N(S). We also investigate the ability of amino acid residues to form a large number of long-range and short-range contacts. Cys, Val, Ile, Tyr, Trp and Phe can form a large number of long-range contacts easily, whereas Glu, Lys, Asp, Gln, Arg and Asn can do so only with difficulty. We also discuss the relative ability of the 20 amino acid residues to form short-range contacts. A comparison is also made between the Fauchere-Pliska hydrophobicity scale and the percentage of residues having a large number of long-range contacts. This investigation can provide some insights into protein structure.

12.
M Shimamura, Y Inoue, S Inoue. Biochemistry 1985, 24(20): 5470-5480
Structures of glycopeptides obtained by exhaustive Pronase digestion of high molecular weight (1.7 × 10^5) salmon egg polysialoglycoprotein have been elucidated. Six principal glycopeptides isolated by gel chromatography and DEAE-Sephadex A-25 chromatography in the absence or presence of borate ion were analyzed for their carbohydrate and amino acid composition, as well as amino acid sequence, and found to be of two distinct types: glycotripeptides, Thr*-Ser*-Glu, and glycotetrapeptides, Thr*-Gly-Pro-Ser, where an asterisk indicates the amino acid residues to which either the Galβ1→3GalNAc or the Fucα1→3GalNAcβ1→3Galβ1→4Galβ1→3GalNAc chain is attached. Their final yield corresponds to 64% of the original desialylated glycoprotein. In view of the simple amino acid composition of salmon egg polysialoglycoprotein (molar ratio Asp2Thr2Ser3Glu1Pro1Gly1Ala3) and the result of alkaline beta-elimination indicating three carbohydrate units linked to two of two threonine and one of three serine residues, a unique primary structure comprising repetitive sequences of the above two types of glycopeptides, which are interspersed by short nonglycosylated peptides consisting of alanine and aspartic acid, has been proposed for the core protein. The molecular secondary ion mass spectra of underivatized glycopeptides were used to obtain their structural information. The anomeric configuration of the proximal sugar-peptide linkages was proven to be alpha by proton nuclear magnetic resonance spectroscopy. This is the first reported systematic study of O-glycosidically linked glycopeptides by these instrumental methods.

13.
Aligning protein sequences using a score matrix has become a routine but valuable method in modern biological research. However, alignment in the 'twilight zone' remains an open issue. It is feasible and necessary to construct new score matrices as more protein structures are resolved. Three structural class-specific score matrices (all-alpha, all-beta and alpha/beta) were constructed based on the structure alignment of low-identity proteins of the corresponding structural classes. The class-specific score matrices were significantly better than a structure-derived matrix (HSDM) and three other generalized matrices (BLOSUM30, BLOSUM60 and Gonnet250) in alignment performance tests. The optimized gap penalties presented here also improve alignment performance. The results indicate that different protein classes have distinct amino acid substitution patterns, and that an amino acid score matrix should be constructed separately for each structural class. The class-specific score matrices could also be used in profile construction to improve homology detection.

14.
Two structural classes of dual alpha4beta1/alpha4beta7 integrin antagonists were investigated via solid-phase parallel synthesis. Using an acylated amino acid backbone, lead compounds containing biphenylalanine or tyrosine carbamate scaffolds were optimized for inhibition of alpha4beta1/VCAM and alpha4beta7/MAdCAM. A comparison of the structure-activity relationships in the inhibition of the alpha4beta7/MAdCAM interaction for substituted amines employed in both scaffolds suggests a similar binding mode for the compounds.

15.
An alpha-helix and a beta-strand are said to be interactively packed if at least one residue in each of the secondary structural elements loses 10% of its solvent accessible contact area on association with the other secondary structural element. An analysis of all such 5,975 nonidentical alpha/beta units in protein structures, defined at ≤ 2.5 Å resolution, shows that the interaxial distance between the alpha-helix and the beta-strand is linearly correlated with the residue-dependent function, log[(V/nda)/n-int], where V is the volume of amino acid residues in the packing interface, nda is the normalized difference in solvent accessible contact area of the residues in packed and unpacked secondary structural elements, and n-int is the number of residues in the packing interface. The beta-sheet unit (beta u), defined as a pair of adjacent parallel or antiparallel hydrogen-bonded beta-strands, packing with an alpha-helix shows a better correlation between the interaxial distance and log(V/nda) for the residues in the packing interface. This packing relationship is shown to be useful in the prediction of interaxial distances in alpha/beta units using the interacting residue information of equivalent alpha/beta units of homologous proteins. It is, therefore, of value in comparative modeling of protein structures.

16.
Discriminant analysis assigns objects to one of several classes on the basis of attributes which characterize the objects. The success of classification depends on the selection of discriminatory attributes and on the choice of an assignment rule. In this paper we focus on the latter and discuss ways to obtain nonlinear classification rules through maximum likelihood, canonical components and projection pursuit. We use both linear and nonlinear methods to classify proteins into three secondary structural types: alpha, beta, and mixed alpha and beta or irregular. Using simple attributes, dependent on amino acid properties, we show that the rate of incorrect classification can be decreased by more than 15% when nonlinear methods are used.

17.
The two cardiac myosin heavy chain isoforms, alpha and beta, differ functionally. Alpha myosin exhibits higher actin-activated ATPase activity than does beta myosin, and hearts expressing alpha myosin exhibit increased contractility relative to hearts expressing beta myosin. To understand the molecular basis for this functional difference, we determined the complete nucleotide sequence of full-length rat alpha and beta myosin heavy chain cDNAs. This study represents the first opportunity to compare full-length fast ATPase and slow ATPase muscle myosin sequences. The alpha and beta myosin heavy chain amino acid sequences are more closely related to each other than to other sarcomeric myosin heavy chain sequences. Of the 1938 amino acid residues in alpha and beta myosin heavy chain, 131 are non-identical, with 37 non-conservative changes. Two-thirds of these non-identical residues are clustered, and several of these clusters map to regions that have been implicated as functionally important. Some of the regions identified by the clusters of non-identical amino acid residues may affect actin binding, ATP hydrolysis and force production.

18.
Standard conformations of a polypeptide chain in irregular protein regions (total citations: 1; self-citations: 0; citations by others: 1)
A detailed stereochemical analysis of known protein structures has been made which shows that: (1) irregular regions of proteins consist of a limited number of standard structures formed by three, four or more residues; (2) an amino acid residue of a protein can adopt one of the six sterically allowed conformations designated here as alpha, alpha L, beta, gamma, delta, and epsilon. It is shown that there are two allowed conformations of a polypeptide chain at the N-end of an alpha-helix, beta alpha n- and beta gamma alpha n-conformations, where n is the number of residues in the alpha-helix. At the C-end of the alpha-helix there are two conformations as well, alpha n gamma beta- and alpha n gamma alpha L beta-ones. Two beta-strands in a beta-hairpin can be joined, for example, by standard structures with beta beta alpha L beta-, beta alpha gamma alpha L beta-, beta alpha alpha gamma alpha L beta-conformations, which are referred to as turns. In the regions where a polypeptide chain passes from one layer to another there are standard structures with beta gamma beta-, beta alpha beta beta-, beta alpha gamma beta-conformations etc., referred to as cross-overs. The structure of any irregular protein region can be represented as a combination of these and other standard turns and cross-overs considered in the paper. Most of the turns and cross-overs have residues in alpha L- or epsilon-conformations, which must be glycine or other residues with small or flexible side chains. Massive hydrophobic residues must not occupy the first beta-positions of most of the standard structures. The results obtained can be applied to predicting the location of turns and cross-overs in proteins from their amino acid sequences and to interpreting electron density maps.

19.
S Z Wang, J S Chen, J L Johnson. Biochemistry 1988, 27(8): 2800-2810
Nitrogenase is composed of two separately purified proteins, a molybdenum-iron (MoFe) protein and an iron (Fe) protein. Structural genes (nifD and nifK) encoding alpha and beta subunits of the MoFe protein of Clostridium pasteurianum (Cp) have been cloned and sequenced. The deduced amino acid sequences were analyzed for structures that could be related to the unique properties of the Cp protein, particularly its low capacity to form an active enzyme with a heterologous Fe protein. Cp nifK is located immediately downstream from Cp nifD, with the start codon of nifK overlapping by one base with the stop codon of nifD. An open reading frame following nifK was identified as nifE. The amino acid sequence deduced from nifK encompasses the partial amino acid sequences previously reported from the isolated beta subunit. Cp nifK encodes a polypeptide of 458 amino acid residues (Mr 50,115) whose amino-terminal region is about 50 residues shorter than the otherwise conserved corresponding polypeptides from four other organisms. In contrast, Cp alpha subunit (nifD product) contains an additional stretch of 50 amino acid residues in the 380-430 region, which is unique to the Cp protein. It therefore appears that the combined size of the alpha and beta subunits could be important to nitrogenase function. An analysis of the predicted secondary structure from the amino acid sequence of each subunit from three species (C. pasteurianum, Azotobacter vinelandii, and Rhizobium japonicum) further revealed structural features, including regions adjacent to some of the conserved cysteine residues, differentiating the Cp MoFe protein from others. These different regions may be further tested for correlation with distinct properties of Cp nitrogenase.

20.
Haemoglobin from donkey was purified and crystallized in space group C2. The present donkey haemoglobin model comprises two subunits, alpha and beta. These alpha and beta subunits comprise 141 and 146 amino acid residues, respectively, and the haem groups. Donkey haemoglobin differs from horse haemoglobin in only two amino acids of the alpha-chain (His20 to Asn and Tyr24 to Phe), and these substitutions do not significantly change the secondary structural features of donkey haemoglobin. The haem group region and subunit contacts closely resemble those of horse methaemoglobin.
