Similar articles
 20 similar articles found (search time: 15 ms)
1.
Amino acid propensities for secondary structures have been used since the 1970s, when Chou and Fasman evaluated them within datasets of a few tens of proteins and developed a method to predict the secondary structure of proteins, which is still in use even though prediction methods have evolved toward very different approaches and higher reliability. Propensity for secondary structures is an intrinsic property of amino acids and is used to develop new algorithms and prediction methods. Our work therefore aimed to investigate which protein dataset is best for evaluating amino acid propensities: larger but non-homogeneous sets, or smaller but homogeneous sets, i.e., all-alpha, all-beta, and alpha-beta proteins. As a first analysis, we evaluated amino acid propensities for helix, beta-strand, and coil in more than 2000 proteins from the PDBselect dataset. With these propensities, secondary structure predictions performed with a method very similar to that of Chou and Fasman gave better results than the original method, which was based on propensities derived from the few tens of X-ray protein structures available in the 1970s. In a refined analysis, we subdivided the PDBselect dataset into three secondary structural classes, i.e., all-alpha, all-beta, and alpha-beta proteins. For each class, the amino acid propensities for helix, beta-strand, and coil were calculated and used to predict secondary structure elements for proteins belonging to the same class by using resubstitution and jackknife tests. This second round of predictions further improved the results of the first round. Therefore, amino acid propensities for secondary structures become more reliable as the homogeneity of the protein dataset used to evaluate them increases. Our results also indicate that all algorithms using propensities for secondary structure can still be improved to obtain better predictive results.
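
As an illustration of what a Chou-Fasman-style propensity is, the sketch below computes, for each residue type and state, the fraction of that residue found in the state divided by the overall fraction of residues in that state. It is not the authors' code: the function name `propensities`, the one-letter state codes `H`, `E`, `C`, and the input layout (parallel sequence and structure strings) are assumptions made for the example.

```python
from collections import Counter


def propensities(sequences, structures, states="HEC"):
    """Return {state: {residue: propensity}} from aligned strings.

    Hypothetical helper: each sequence is paired with a per-residue
    secondary structure string of the same length (H, E or C).
    """
    residue_counts = Counter()   # occurrences of each residue type
    state_counts = Counter()     # occurrences of each state
    pair_counts = Counter()      # occurrences of (residue, state)
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure lengths differ")
        for aa, state in zip(seq, ss):
            residue_counts[aa] += 1
            state_counts[state] += 1
            pair_counts[(aa, state)] += 1

    total = sum(residue_counts.values())
    result = {}
    for state in states:
        if state_counts[state] == 0:
            continue  # state never observed, propensity undefined
        background = state_counts[state] / total
        result[state] = {
            aa: (pair_counts[(aa, state)] / n) / background
            for aa, n in residue_counts.items()
        }
    return result
```

A value above 1 means the residue is over-represented in that state relative to the dataset as a whole. Running the same function on an all-alpha subset and on the full set would give the two kinds of propensity tables the abstract compares.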

2.
To study local structures in proteins, we previously developed an autoassociative artificial neural network (autoANN) and clustering tool to discover intrinsic features of macromolecular structures. The hidden unit activations computed by the trained autoANN are a convenient low-dimensional encoding of the local protein backbone structure. Clustering these activation vectors results in a unique classification of protein local structural features called Structural Building Blocks (SBBs). Here we describe application of this method to a larger database of proteins, verification of the applicability of this method to structure classification, and subsequent analysis of amino acid frequencies and several commonly occurring patterns of SBBs. The SBB classification method has several interesting properties: 1) it identifies the regular secondary structures, α helix and β strand; 2) it consistently identifies other local structure features (e.g., helix caps and strand caps); 3) strong amino acid preferences are revealed at some positions in some SBBs; and 4) distinct patterns of SBBs occur in the “random coil” regions of proteins. Analysis of these patterns identifies interesting structural motifs in the protein backbone structure, indicating that SBBs can be used as “building blocks” in the analysis of protein structure. This type of pattern analysis should increase our understanding of the relationship between protein sequence and local structure, especially in the prediction of protein structures. © 1997 Wiley-Liss, Inc.

3.
A suite of FORTRAN programs, PREF, is described for calculating preference functions from the database of known protein structures and for comparing smoothed profiles of sequence-dependent preferences in proteins of unknown structure. Amino acid preferences for a secondary structure are considered as functions of a sequence environment. The sequence environment of an amino acid residue in a protein is defined as an average over some physical, chemical, or statistical property of its primary structure neighbors. The frequency distribution of sequence environments in the database of soluble protein structures is approximately normal for each amino acid type of known secondary conformation. An analytical expression for the dependence of preferences on sequence environment is obtained after each frequency distribution is replaced by the corresponding Gaussian function. The preference for the α-helical conformation increases for each amino acid type as the sequence environment of buried solvent-accessible surface areas increases. We show that a set of preference functions based on buried surface area is useful for predicting folding motifs in α-class proteins and in integral membrane proteins. The prediction accuracy for helical residues is 79% for 5 integral membrane proteins and 74% for 11 α-class soluble proteins. Most residues found in transmembrane segments of membrane proteins with known α-helical structure are indeed predicted to be in the helical conformation because of very high middle helix preferences. Both extramembrane and transmembrane helices in the photosynthetic reaction center M and L subunits are correctly predicted. We point out in the discussion that our method of conformational preference functions can identify which physical properties of the amino acids are important in the formation of particular secondary structure elements. © 1993 John Wiley & Sons, Inc.

4.
Biased usage of synonymous codons has long been explained in terms of cellular tRNA abundance. Taking advantage of publicly available gene expression data for Saccharomyces cerevisiae, a systematic analysis of codon and amino acid usage has been performed in two different coding regions, corresponding to the regular (helix and strand) and the irregular (coil) protein secondary structures. Our analyses suggest that, apart from tRNA abundance, mRNA folding stability is another major evolutionary force shaping the codon and amino acid usage differences between the highly and lowly expressed genes in the S. cerevisiae genome and, surprisingly, that it depends on the coding regions corresponding to the secondary structures of the encoded proteins. This is a new paradigm in understanding codon usage in S. cerevisiae. Differential amino acid usage between highly and lowly expressed genes in the regions coding for the irregular protein secondary structure in S. cerevisiae is explained by the stability of the mRNA folded structure. Irrespective of the protein secondary structural type, the highly expressed genes always tend to encode cheaper amino acids in order to reduce the overall biosynthetic cost of production of the corresponding protein. This study supports the hypothesis that tRNA abundance is a consequence of, and not a reason for, the biased usage of amino acids between highly and lowly expressed genes.

5.
The nucleotide frequencies in the second codon positions of genes are remarkably different for the coding regions that correspond to different secondary structures in the encoded proteins, namely, helix, beta-strand and aperiodic structures. Indeed, hydrophobic and hydrophilic amino acids are encoded by codons having U or A, respectively, in their second position. Moreover, the beta-strand structure is strongly hydrophobic, while aperiodic structures contain more hydrophilic amino acids. The relationship between nucleotide frequencies and protein secondary structures is associated not only with the physico-chemical properties of these structures but also with the organisation of the genetic code. In fact, this organisation seems to have evolved so as to preserve the secondary structures of proteins by preventing deleterious amino acid substitutions that could modify the physico-chemical properties required for an optimal structure.
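
The kind of count described above can be sketched as follows: for each codon, take the middle nucleotide and tally it under the secondary structure state of the residue that codon encodes. The function name `second_position_by_structure`, the state letters, and the assumption that the coding sequence is in frame with one structure letter per codon are all choices made for this example, not details from the article.

```python
from collections import Counter, defaultdict


def second_position_by_structure(coding_seq, structure):
    """Return {state: {nucleotide: frequency}} for second codon positions.

    Hypothetical helper: `coding_seq` is an in-frame DNA or RNA string and
    `structure` holds one secondary structure letter per encoded residue.
    """
    seq = coding_seq.upper().replace("T", "U")  # report in RNA alphabet
    counts = defaultdict(Counter)
    for i, state in enumerate(structure):
        codon = seq[3 * i:3 * i + 3]
        if len(codon) == 3:          # ignore a truncated final codon
            counts[state][codon[1]] += 1
    return {
        state: {nt: c[nt] / sum(c.values()) for nt in "ACGU"}
        for state, c in counts.items()
    }
```

On real data one would expect, if the abstract's observation holds, a higher U frequency for strand residues and a higher A frequency for aperiodic residues.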

6.
Understanding the key factors that influence the interaction preferences of amino acids in the folding of proteins has remained a challenge. Here we present a knowledge‐based approach for determining the effective interactions between amino acids based on amino acid type, their secondary structure, and the contact‐based environment in which they find themselves in the native state structure, as measured by their number of neighbors. We find that the optimal information is approximately encoded in a 60 × 60 matrix describing the 20 types of amino acids in three distinct secondary structures (helix, beta strand, and loop). We carry out a clustering scheme to understand the similarity between these interactions and to elucidate a nonredundant set. We demonstrate that the inferred energy parameters can be used for assessing the fit of a given sequence into a putative native state structure.
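
To make the 60 × 60 layout concrete, the sketch below maps each (amino acid, secondary structure) pair to one of 60 indices and looks up an interaction value in a matrix of that size. The ordering of residues and structures, and the names `state_index` and `pair_energy`, are assumptions for illustration. The article does not specify how its matrix is indexed.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"        # 20 standard residues
STRUCTURES = ("helix", "strand", "loop")    # 3 secondary structure classes


def state_index(residue, structure):
    """Map (residue, structure) to an index in 0..59 (assumed ordering)."""
    return STRUCTURES.index(structure) * len(AMINO_ACIDS) + AMINO_ACIDS.index(residue)


def pair_energy(matrix, res_a, ss_a, res_b, ss_b):
    """Look up the interaction value for two residues in given structures."""
    return matrix[state_index(res_a, ss_a)][state_index(res_b, ss_b)]
```

With this layout, 20 residue types times 3 structures gives the 60 rows and columns mentioned in the abstract. Any real parameter values would have to come from the published matrix.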

7.
Circular dichroism (CD) spectroscopy is a valuable method for defining canonical secondary structure contents of proteins based on empirically‐defined spectroscopic signatures derived from proteins with known three‐dimensional structures. Many proteins identified as being “Intrinsically Disordered Proteins” have a significant amount of their structure that is neither sheet, helix, nor turn; this type of structure is often classified by CD as “other”, “random coil”, “unordered”, or “disordered”. However, the “other” category can also include polyproline II (PPII)‐type structures, whose spectral properties have not been well‐distinguished from those of unordered structures. In this study, synchrotron radiation circular dichroism spectroscopy was used to investigate the spectral properties of collagen and polyproline, which both contain PPII‐type structures. Their native spectra were compared as representatives of PPII structures. In addition, their spectra before and after treatment with various conditions to produce unfolded or denatured structures were also compared, with the aim of defining the differences between CD spectra of PPII and disordered structures. We conclude that the spectral features of collagen are more appropriate than those of polyproline for use as the representative spectrum for PPII structures present in typical amino acid‐containing proteins, and that the single most characteristic spectroscopic feature distinguishing a PPII structure from a disordered structure is the presence of a positive peak around 220 nm in the former but not in the latter. These spectra are now available for inclusion in new reference data sets used for CD analyses of the secondary structures of soluble proteins.

8.
Common amino acid sequence domains among the LEA proteins of higher plants   Total citations: 41, self-citations: 0, citations by others: 41
LEA proteins are late embryogenesis abundant in the seeds of many higher plants and are probably universal in occurrence in plant seeds. LEA mRNAs and proteins can be induced to appear at other stages in the plant's life by desiccation stress and/or treatment with the plant hormone abscisic acid (ABA). A role in protecting plant structures during water loss is likely for these proteins, with ABA functioning in the stress transduction process. Presented here are conserved tracts of amino acid sequence among LEA proteins from several species that may represent domains functionally important in desiccation protection. Curiously, an 11 amino acid sequence motif is found tandemly repeated in a group of LEA proteins of vastly different sizes. Analysis of this motif suggests that it exists as an amphiphilic helix which may serve as the basis for higher order structure.

9.
We present simulation results on a simple model to describe the hydrogen bonding in proteins with helical structures. The approximation distinguishes between α helices, where each amino acid interacts with another one located four residues apart, 3₁₀ structures, where the number of amino acids in between is three, and the π arrangement, in which that number is five. We found that the main features of the system are determined by the most stable structure (the α helix) and that the other types of hydrogen bonds appear just below the denaturation temperature of the peptide. The probability of finding a 3₁₀-type bond is greater at the beginning or at the end of the peptide chain, irrespective of its length, while in short peptides the existence of those bonds appreciably increases the denaturation temperature, promoting stability. On the other hand, the denaturation temperature decreases with the length of the peptide until it reaches a value independent of the number of amino acid residues.
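
The three bond types in this model differ only in how far apart the two bonded residues are along the chain: three, four, or five positions. The sketch below encodes that rule, labelling the separations with the conventional helix names 3-10, alpha and pi, and counts how many bonds of each type a chain of a given length could form at most. The names `classify_bond` and `possible_bonds` are hypothetical, and the code says nothing about the energetics or the simulation method used in the article.

```python
# Sequence separation between bonded residues -> conventional helix name.
HELIX_BY_SEPARATION = {3: "3-10", 4: "alpha", 5: "pi"}


def classify_bond(i, j):
    """Return the helix type for a bond between residues i and j, or None."""
    return HELIX_BY_SEPARATION.get(abs(j - i))


def possible_bonds(length):
    """Count the (i, i + separation) pairs available in a chain of `length`."""
    return {
        name: max(length - sep, 0)
        for sep, name in HELIX_BY_SEPARATION.items()
    }
```

For example, a 10-residue peptide has at most 7, 6 and 5 candidate bonds of the three types, respectively.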

10.
Analysis of 68 proteins from the Protein Data Bank revealed a new, widespread type of secondary structure, designated the mobile (M-) conformation. The helical parameters of the M-conformation are close to those of the poly-L-proline II type helix. Its occurrence in globular proteins approximates that of the beta-sheet. The angles corresponding to the position of the M-conformation maximum in the distribution of amino acid residues on a conformational map are phi = -65 degrees, psi = 140 degrees. Its unique features and high occurrence in proteins make it possible to distinguish the M-conformation as an independent third type of secondary structure in globular proteins that should be included in the present classification.
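
A rough way to flag residues near the reported maximum (phi about -65 degrees, psi about 140 degrees) is to test whether both backbone angles fall within a tolerance of that point, taking the 360-degree wrap-around into account. The 30-degree tolerance and the names `angle_diff` and `is_m_conformation` are assumptions for this sketch. The article does not state region boundaries.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def is_m_conformation(phi, psi, tol=30.0):
    """True if (phi, psi) lies within `tol` degrees of (-65, 140).

    The tolerance is an illustrative assumption, not a published cutoff.
    """
    return angle_diff(phi, -65.0) <= tol and angle_diff(psi, 140.0) <= tol
```

A typical alpha-helical residue (phi near -60, psi near -45) fails the psi test, while a residue at (-70, 150) passes.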

11.
A model is presented for the formation of alpha-helices and beta-structures determined by the joint action of three elements: N-terminal, internal, and C-terminal fragments. An algorithm for calculating their localization in a given amino acid sequence was constructed on the basis of this model. The preference of fragments of the amino acid sequence for a definite type of secondary structure was estimated from the corresponding average values of the linear discriminant functions d_sk (s = alpha, beta; k = N, in, C). The latter were constructed in the previous paper on the basis of the significant characteristics identified there. These integral characteristics are used for calculating the localization of discrete secondary structures. The overall prediction gave 71% correctly predicted residues for 3 states (alpha, beta, c) and 62% for 4 states (alpha, beta, c, t) on the training set of 72 proteins. For the control set (15 proteins) the prediction accuracy is about 65%. The essential advantages of this method are: 1) the possibility of localizing the discrete secondary structures; 2) the high accuracy of prediction of long secondary structures (approximately 90% for alpha-helices and approximately 80% for beta-structures), which is important for determining protein folding. The influence of mutations on the secondary structure of proteins was investigated. An abnormally high stability of the secondary structures of immunoglobulins to mutations was revealed. This probably results from the selection during evolution of amino acid sequence variants that are able to provide the functional variability of antigenic determinants while keeping the tertiary structure of the protein invariant.

12.
Protein aggregation, being an outcome of improper protein folding, is largely dependent on the folding kinetics of a protein. Previous studies have reported a positive correlation between the stability of the secondary structural elements of a protein and their rate of folding/unfolding. In this in silico study, the secondary and tertiary structures of proteins a) that form inclusion bodies on overexpression in Escherichia coli, b) that form amyloid fibrils and c) that are soluble on overexpression in E. coli are analyzed for certain features that are known to be associated with structural stability. The study revealed that the soluble proteins seem to have a higher rate of folding (based on contact order) and a lower percentage of exposed hydrophobic residues as compared to the inclusion body forming or amyloidogenic proteins. The soluble proteins also seem to have a more favored helix and strand composition (based on the known secondary structural propensities of amino acids). The secondary structure analyses also reveal that the evolutionary pressure is directed against protein aggregation. This understanding of the positive correlation between structural stability and solubility, along with the other parameters known to influence aggregation, could be exploited in the design of mutations aimed at reducing the aggregation propensity of the proteins.
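
Contact order, used above as a proxy for folding rate, is commonly computed in its relative form as the average sequence separation of contacting residue pairs divided by the chain length. The sketch below implements that standard definition. The contact list is taken as given, since the study's contact cutoff and exact variant of the measure are not stated here, and the function name `relative_contact_order` is chosen for the example.

```python
def relative_contact_order(contacts, length):
    """Average sequence separation of contacts, normalised by chain length.

    `contacts` is an iterable of (i, j) residue index pairs already judged
    to be in contact; how contacts are detected is left to the caller.
    """
    contacts = list(contacts)
    if not contacts or length <= 0:
        return 0.0
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * length)
```

Lower values indicate that contacts are mostly local in sequence, which is generally associated with faster folding.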

13.
The beta hairpin motif is a ubiquitous protein structural motif that can be found in molecules across the tree of life. This motif, which is also popular in synthetically designed proteins and peptides, is known for its stability and adaptability to broad functions. Here, we systematically probe all 49,000 unique beta hairpin substructures contained within the Protein Data Bank (PDB) to uncover key characteristics correlated with stable beta hairpin structure, including amino acid biases and enriched interstrand contacts. We find that position specific amino acid preferences, while seen throughout the beta hairpin structure, are most evident within the turn region, where they depend on subtle turn dynamics associated with turn length and secondary structure. We also establish a set of broad design principles, such as the inclusion of aspartic acid residues at a specific position and the careful consideration of desired secondary structure when selecting residues for the turn region, that can be applied to the generation of libraries encoding proteins or peptides containing beta hairpin structures.

14.
A priori knowledge of secondary structure content can be of great use in theoretical and experimental determination of protein structure. We present a method that uses two computer-simulated neural networks placed in "tandem" to predict the secondary structure content of water-soluble, globular proteins. The first of the two networks, NET1, predicts a protein's helix and strand content given information about the protein's amino acid composition, molecular weight and heme presence. Because NET1 contained more adjustable parameters (network weights) than learning examples, this network experienced problems with memorization, which is the inability to generalize onto new, never-seen-before examples. To overcome this problem, we designed a second network, NET2, which learned to determine when NET1 was in a state of generalization. Together, these two networks produce prediction errors as low as 5.0% and 5.6% for helix and strand content, respectively, on a set of protein crystal structures bearing little homology to those used in network training. A comparison between three other methods including a multiple linear regression analysis, a non-hidden-node network analysis and a secondary structure assignment analysis reveals that our tandem neural network scheme is, indeed, the best method for predicting secondary structure content. The results of our analysis suggest that the knowledge of sequence information is not necessary for highly accurate predictions of protein secondary structure content.

15.
Amino acid residues can be divided into similar groups by their frequencies of mutual replacement along the evolutionary pathway and by their tendencies to form spatial contacts in the tertiary structures of globular proteins. Each residue was compared with the cluster of its spatial surroundings, that is, the totality of residues brought together in space. A total of 5210 clusters in 32 non-homologous proteins with established tertiary structure, and 6447 clusters formed only by variable amino acid residues, were analysed. Spatial contacts among residues were studied as a function of the secondary structure and the number of residues in a cluster. It was assumed that functionally admissible mutations may be defined, first of all, by the degree of proximity of amino acid residues in the spatial surroundings.

16.
17.
A search for the occurrence of the rare pi-helix was performed with Iditis from the Oxford Molecular Group upon the Protein Data Bank. In 8 of the 10 confirmed crystal structures that harbor the pi-helix, its unique conformation has been linked directly to the formation or stabilization of a specific binding site within the protein. In the discussion to follow, the role for each of these eight pi-helices will be addressed in regard to protein function. It is clear upon closer examination that the conformation of the pi-helix has evolved to provide unique structural features within a variety of proteins.

18.
MOTIVATION: Data that characterize primary and tertiary structures of proteins are now accumulating at a rapid and accelerating rate and require automated computational tools to extract critical information relating amino acid changes to the spectrum of functional attributes exhibited by a protein. We propose that immunoglobulin-type beta-domains, which are found in approximately 400 functionally distinct forms in humans alone, provide the immense genetic variation within limited conformational changes that might facilitate the development of new computational tools. As an initial step, we describe here an approach based on Support Vector Machine (SVM) technology to identify amino acid variations that contribute to the functional attribute of pathological self-assembly by some human antibody light chains produced during plasma cell diseases. RESULTS: We demonstrate that SVMs with selective kernel scaling are an effective tool in discriminating between benign and pathologic human immunoglobulin light chains. Initial results compare favorably against manual classification performed by experts and indicate the capability of SVMs to capture the underlying structure of the data. The data set consists of 70 proteins of human antibody kappa1 light chains, each represented by aligned sequences of 120 amino acids. We perform feature selection based on a first-order adaptive scaling algorithm, which confirms the importance of changes in certain amino acid positions and identifies other positions that are key in the characterization of protein function.

19.
Anbarasu A, Anand S, Rao S. Bio Systems, 2007, 90(3): 792-801
We have investigated the roles played by C-H...O=C interactions in RNA binding proteins. In the 59 RNA binding proteins studied, there was an average of 78 CH...O=C interactions per protein and an average of one significant CH...O=C interaction for every 6 residues. Main chain-main chain (MM) CH...O=C interactions are the predominant type of interaction in RNA binding proteins. The donor atom contribution to CH...O=C interactions was mainly from aliphatic residues. The acceptor atom contribution for MM CH...O=C interactions was mainly from Val, Phe, Leu, Ile, Arg and Ala. The secondary structure preference analysis of CH...O=C interacting residues showed that Arg, Gln, Glu and Tyr preferred to be in helix, while Ala, Asp, Cys, Gly, Ile, Leu, Lys, Met, Phe, Trp and Val preferred to be in strand conformation. Most of the CH...O=C interacting polar amino acid residues were solvent exposed, while the majority of the CH...O=C interacting nonpolar residues were excluded from the solvent. Long- and medium-range CH...O=C interactions are the predominant type of interaction in RNA binding proteins. More than 50% of CH...O=C interacting residues had a higher conservation score. A significant percentage of CH...O=C interacting residues had one or more stabilization centers. Sixty-six percent of the theoretically predicted stabilizing residues were also involved in CH...O=C interactions, and hence these residues may also contribute additional stability to RNA binding proteins.

20.
Although not the sole feature responsible, the packing of amino acid side chains in the interior of proteins is known to contribute to protein conformational specificity. While a number of amphipathic peptide sequences with optimized hydrophobic domains have been designed to fold into a desired aggregation state, the contribution of the amino acids located on the hydrophilic side of such peptides to the final packing has not been investigated thoroughly. A set of self-aggregating 18-mer peptides designed previously to adopt a high level of alpha-helical conformation in benign buffer is used here to evaluate the effect of the nature of the amino acids located on the hydrophilic face on the packing of a four alpha-helical bundle. These peptides differ from one another by only one to four amino acid mutations on the hydrophilic face of the helix and share the same hydrophobic core. The secondary and tertiary structures in the presence or absence of denaturants were determined by circular dichroism in the far- and near-UV regions, fluorescence and nuclear magnetic resonance spectroscopy. Significant differences in folding ability, as well as chemical and thermal stabilities, were found between the peptides studied. In particular, surface salt bridges may form which would increase both the stability and extent of the tertiary structure of the peptides. The structural behavior of the peptides may be related to their ability to catalyze the decarboxylation of oxaloacetate, with peptides that have a well-defined tertiary structure acting as true catalysts.
