Similar literature
1.
The nucleotide frequencies in the second codon positions of genes are remarkably different for the coding regions that correspond to different secondary structures in the encoded proteins, namely, helix, beta-strand and aperiodic structures. Indeed, hydrophobic and hydrophilic amino acids are encoded by codons having U or A, respectively, in their second position. Moreover, the beta-strand structure is strongly hydrophobic, while aperiodic structures contain more hydrophilic amino acids. The relationship between nucleotide frequencies and protein secondary structures is associated not only with the physico-chemical properties of these structures but also with the organisation of the genetic code. In fact, this organisation seems to have evolved so as to preserve the secondary structures of proteins by preventing deleterious amino acid substitutions that could modify the physico-chemical properties required for an optimal structure.
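The second-position claim can be checked directly against the standard genetic code. The sketch below is only an illustration: the compact 64-letter table string (bases ordered U, C, A, G) and the function name `residues_by_second_base` are choices made here, not taken from the abstract.

```python
# Standard genetic code, codons enumerated with bases in the order U, C, A, G
# (first base varies slowest, third base fastest); '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

def residues_by_second_base(second):
    """Return the set of residues encoded by codons with the given second base."""
    j = BASES.index(second)
    return {AMINO[16 * i + 4 * j + k] for i in range(4) for k in range(4)}

print(sorted(residues_by_second_base("U")))  # hydrophobic residues
print(sorted(residues_by_second_base("A")))  # mostly hydrophilic residues, plus stops
```

With U in the second position the encoded residues are Phe, Leu, Ile, Met and Val; with A they are Tyr, His, Gln, Asn, Lys, Asp, Glu and two of the stop codons.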

2.
Wang J  Feng JA 《Protein engineering》2003,16(11):799-807
This paper reports an extensive sequence analysis of the alpha-helices of proteins. alpha-Helices were extracted from the Protein Data Bank (PDB) and were divided into groups according to their sizes. It was found that some amino acids had differential propensity values for adopting helical conformation in short, medium and long alpha-helices. Pro and Trp had a significantly higher propensity for helical conformation in short helices than in medium and long helices. Trp was the strongest helix conformer in short helices. Sequence patterns favoring helical conformation were derived from a neighbor-dependent sequence analysis of proteins, which calculated the effect of neighboring amino acid type on the propensity of residues for adopting a particular secondary structure in proteins. This method produced an enhanced statistical significance scale that allowed us to explore the positional preference of amino acids for alpha-helical conformations. It was shown that the amino acid pair preference for alpha-helix had a unique pattern and this pattern was not always predictable by assuming proportional contributions from the individual propensity values of the amino acids. Our analysis also yielded a series of amino acid dyads that showed preference for alpha-helix conformation. The data presented in this study, along with our previous study on loop sequences of proteins, should prove useful for developing potential 'codes' for recognizing sequence patterns that are favorable for specific secondary structural elements in proteins.

3.
Mooney SD  Liang MH  DeConde R  Altman RB 《Proteins》2005,61(4):741-747
A primary challenge for structural genomics is the automated functional characterization of protein structures. We have developed a sequence-independent method called S-BLEST (Structure-Based Local Environment Search Tool) for the annotation of previously uncharacterized protein structures. S-BLEST encodes the local environment of an amino acid as a vector of structural property values. It has been applied to all amino acids in a nonredundant database of protein structures to generate a searchable structural resource. Given a query amino acid from an experimentally determined or modeled structure, S-BLEST quickly identifies similar amino acid environments using a K-nearest neighbor search. In addition, the method gives an estimation of the statistical significance of each result. We validated S-BLEST on X-ray crystal structures from the ASTRAL 40 nonredundant dataset. We then applied it to 86 crystallographically determined proteins in the Protein Data Bank (PDB) with unknown function and with no significant sequence neighbors in the PDB. S-BLEST was able to associate 20 proteins with at least one local structural neighbor and identify the amino acid environments that are most similar between those neighbors.
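The K-nearest neighbor lookup over environment vectors can be sketched as follows. This is not the S-BLEST implementation: the Euclidean metric, the brute-force scan, the function name `k_nearest`, and the toy two-component vectors are all assumptions made for illustration.

```python
import heapq
import math

def k_nearest(query, database, k):
    """Return the k (residue_id, distance) pairs closest to the query vector.

    `database` maps a hypothetical residue identifier to its environment
    vector; distance is plain Euclidean distance (an assumption).
    """
    scored = ((rid, math.dist(query, vec)) for rid, vec in database.items())
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])

# Toy database of made-up environment vectors.
environments = {"a": (0.0, 0.0), "b": (1.0, 1.0), "c": (5.0, 5.0)}
hits = k_nearest((0.9, 0.9), environments, k=2)
print([rid for rid, _ in hits])
```

A brute-force scan is adequate for a sketch; a production search over every residue in a structure database would more plausibly use a spatial index.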

4.
About 6000 contact regions (patches) of helix-to-helix packing from 300 well-resolved non-homologous protein structures were considered. The patches were defined by the spatial helical neighbors and were estimated in atomic detail using a variable distance criterion. The following questions are addressed. (1) Are the amino acid preferences and atomic composition of distinct types of helical patches indicative of the type of their neighbor? Distributions of size, atomic composition and packing density are compared for different types of helical interfaces. Contact preferences are thereby derived for parts of secondary structures adjoining each other or pointing towards the solvent. (2) Is it possible to cluster helical patches according to their structural similarity? For this purpose the patches were classified with an automatic sequence-independent superposition procedure, which yields a distinctly reduced set of representative interfaces. On this basis, the methodology for finding exchangeable patches in different proteins is demonstrated.

5.
Crasto CJ  Feng J 《Proteins》2001,42(3):399-413
We performed an extensive sequence analysis on the loops of proteins. By dividing a loop databank derived from the Protein Data Bank into groups, we analyzed the chemical characteristics and the sequence preferences of loops of different lengths and loops connecting different secondary structures in proteins. We found that a large population of loops in our loop databank (94.4%) is either partially or completely surface-exposed. A majority of surface loops in proteins are hydrophilic, whereas the chemical characteristics of interior loops are relatively neutral according to Eisenberg's consensus hydrophobicity scale. As a first step in investigating the intrinsic sequence-structure relationship of loop sequences in proteins, we performed a neighbor-dependent sequence analysis that calculated the effect of the neighboring amino acid type on the loop propensity of residues in loops. This method enhances the statistical significance of residue propensity, thus allowing us to explore the positional preference of amino acids in loops. Our analysis yielded a series of amino acid dyads that showed high preference for loop conformation. The data presented in this study should prove useful for developing potential codes in recognizing loop sequences in proteins.

6.
We explore the question of whether local effects (originating from the amino acids' intrinsic secondary structure propensities) or nonlocal effects (reflecting the sequence of amino acids as a whole) play a larger role in determining the fold of globular proteins. Earlier circular dichroism studies have shown that the pattern of polar and nonpolar amino acids (nonlocal effect) dominates over the amino acid intrinsic propensity (local effect) in determining the secondary structure of oligomeric peptides. In this article, we present a coarse-grained computational model that allows us to quantitatively estimate the role of local and nonlocal factors in determining both the secondary and tertiary structure of small, globular proteins. The amino acid intrinsic secondary structure propensity is modeled by a dihedral potential term. This dihedral potential is parametrized to match experimental measurements of secondary structure propensity. Similarly, the magnitude of the attraction between hydrophobic residues is parametrized to match the experimental transfer free energies of hydrophobic amino acids. Under these parametrization conditions, we systematically explore the degree of frustration a given polar/nonpolar pattern can tolerate when the intrinsic secondary structure propensities are in opposition to it. When the parameters are in the biophysically relevant range, we observe that the fold of small, globular proteins is determined by the pattern of polar and nonpolar amino acids regardless of their intrinsic secondary structure propensities. Our simulations shed new light on previous observations that tertiary interactions are more influential in determining protein structure than secondary structure propensity. The fact that this can be inferred using a simple polymer model that lacks most of the biochemical details points to the fundamental importance of binary patterning in governing folding.

7.
Understanding the key factors that influence the interaction preferences of amino acids in the folding of proteins has remained a challenge. Here we present a knowledge-based approach for determining the effective interactions between amino acids based on amino acid type, their secondary structure, and the contact-based environment in which they find themselves in the native state structure, as measured by their number of neighbors. We find that the optimal information is approximately encoded in a 60 × 60 matrix describing the 20 types of amino acids in three distinct secondary structures (helix, beta strand, and loop). We carry out a clustering scheme to understand the similarity between these interactions and to elucidate a nonredundant set. We demonstrate that the inferred energy parameters can be used for assessing the fit of a given sequence into a putative native state structure.
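The 60 × 60 layout follows from treating each (amino acid, secondary structure) pair as one of 20 × 3 = 60 residue classes. The sketch below shows one possible indexing; the alphabetical residue order, the ordering of the three states, and the name `class_index` are assumptions made here, not the authors' convention.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"        # 20 one-letter codes (assumed order)
STRUCTURES = ("helix", "strand", "loop")    # 3 states (assumed order)

def class_index(amino_acid, structure):
    """Map an (amino acid, secondary structure) pair to a row/column in 0..59."""
    return STRUCTURES.index(structure) * len(AMINO_ACIDS) + AMINO_ACIDS.index(amino_acid)

# A 60 x 60 matrix of pair energies, zero-filled here as a placeholder.
n = len(AMINO_ACIDS) * len(STRUCTURES)
energy = [[0.0] * n for _ in range(n)]

# Looking up the entry for a helical Ala in contact with a loop Tyr:
i, j = class_index("A", "helix"), class_index("Y", "loop")
print(i, j, energy[i][j])
```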

8.
The catalytic activity of glycerol kinase (EC 2.7.1.30, ATP:glycerol 3-phosphotransferase) from Escherichia coli is inhibited allosterically by IIA(Glc) (previously known as III(Glc)), the glucose-specific phosphocarrier protein of the phosphoenolpyruvate:glycose phosphotransferase system. A sequentially contiguous portion of glycerol kinase undergoes an induced fit conformational change involving coil, alpha-helix, and 3(10)-helix upon IIA(Glc) binding. A second induced fit occurs upon binding of Zn(II) to a novel intermolecular site, which increases complex stability by cation-promoted association. Eight of the ten sequentially contiguous amino acids are substituted with alanine to evaluate the roles of these positions in complex formation. Effects of the substitutions reveal both favorable and antagonistic contributions of the normal amino acids to complex formation, and Zn(II) reverses these contributions for two of the amino acids. The consequences of some of the substitutions for IIA(Glc) inhibition are consistent with changes in the intermolecular interactions seen in the crystal structures. However, for the amino acids that are located in the region that is alpha-helical in the absence of IIA(Glc), the effects of the substitutions are not consistent with changes in intermolecular interactions but with increased stability of the alpha-helical region due to the higher alpha-helix propensity of alanine. The reduced affinity for IIA(Glc) binding seen for these variants is consistent with predictions of Freire and co-workers [Luque, I., and Freire, E. (2000) Proteins: Struct., Funct., Genet. 4, 63-71]. These variants show also increased cation-promoted association by Zn(II) so that the energetic contribution of Zn(II) to complex formation is doubled. 
The similarity of effects of the alanine substitutions of the amino acids in the alpha-helical region for IIA(Glc) binding affinity and cation-promoted association by Zn(II) indicates that they function as a cooperative unit.

9.
A suite of FORTRAN programs, PREF, is described for calculating preference functions from the database of known protein structures and for comparing smoothed profiles of sequence-dependent preferences in proteins of unknown structure. Amino acid preferences for a secondary structure are considered as functions of a sequence environment. The sequence environment of an amino acid residue in a protein is defined as an average over some physical, chemical, or statistical property of its primary structure neighbors. The frequency distribution of sequence environments in the database of soluble protein structures is approximately normal for each amino acid type of known secondary conformation. An analytical expression for the dependence of preferences on sequence environment is obtained after each frequency distribution is replaced by the corresponding Gaussian function. The preference for the α-helical conformation increases for each amino acid type with the increase of sequence environment of buried solvent-accessible surface areas. We show that a set of preference functions based on buried surface area is useful for predicting folding motifs in α-class proteins and in integral membrane proteins. The prediction accuracy for helical residues is 79% for 5 integral membrane proteins and 74% for 11 α-class soluble proteins. Most residues found in transmembrane segments of membrane proteins with known α-helical structure are indeed predicted to be in the helical conformation because of very high middle helix preferences. Both extramembrane and transmembrane helices in the photosynthetic reaction center M and L subunits are correctly predicted. We point out in the discussion that our method of conformational preference functions can identify which physical properties of the amino acids are important in the formation of particular secondary structure elements. © 1993 John Wiley & Sons, Inc.

10.
Dwyer DS 《Proteins》2006,63(4):939-948
The electronic properties of amino acid side-chains are emerging as an important factor in the preference for secondary structure in proteins. These properties have not been fully characterized, nor has their role in the behavior of peptides been explored in any detail. The present studies sought to evaluate several possibilities: (1) that hydrophilicity can be expressed solely in electronic terms, (2) that substituent effects of side-chains extend across the peptide bond, and (3) that nearest-neighbor effects in dipeptides correlate with secondary structural preferences. Quantum mechanics (QM) calculations were used to define the electronic properties of individual amino acids and dipeptides. It was found that the hydrophilicity of an amino acid side-chain can be accurately represented as a function of the electron densities of its component atoms. In addition, the nature of an amino acid in the second position of a dipeptide affects the electronic properties (Mulliken populations and electron densities) of the main-chain atoms of the first residue. Certain electronic features of the dipeptides strongly correlated with propensity for secondary structure. Specifically, Mulliken population data at the Calpha atom and N atom predicted preference for alpha-helices versus coil and strand conformations, respectively. Analysis of dipeptides arrayed in either helical or extended structures revealed lengthening of main-chain bonds in the alpha-helical conformations. A thorough characterization of the electronic properties of amino acids and short peptide segments may provide a better understanding of the forces that determine secondary structure in proteins.

11.
The cystine-rich antifreeze polypeptides (AFP) from sea raven were fractionated by reverse-phase high performance liquid chromatography into several components, with SR2 (Mr 17,000) as the major AFP. Sea raven AFP cDNA clones were isolated from a liver cDNA library using a synthetic oligonucleotide, and the identity of one of the clones, C2-1, was confirmed by hybridization selection and cell-free translation. C2-1 encodes a pre-AFP of 195 amino acids with no evidence of any profragments. Comparison of the deduced amino acid sequence with partial peptide sequences from SR2 showed substitutions in at least four amino acid positions, suggesting that C2-1 cDNA codes for a minor component. Both the primary and the predicted secondary structures of sea raven AFP are completely different from those of other fish AFP. This further confirms that sea raven AFP belongs to a different class of antifreezes. The high frequency of reverse turns and the presence of paired hydrophilic amino acids in these structures are striking features of the protein and may contribute to their antifreeze action.

12.
13.
Summary: We examine in this paper one of the expected consequences of the hypothesis that modern proteins evolved from random heteropeptide sequences. Specifically, we investigate the lengthwise distributions of amino acids in a set of 1,789 protein sequences with little sequence identity using the run test statistic (r_o) of Mood (1940, Ann. Math. Stat. 11, 367–392). The probability density of r_o for a collection of random sequences has mean = 0 and variance = 1 [the N(0,1) distribution] and can be used to measure the tendency of amino acids of a given type to cluster together in a sequence relative to that of a random sequence. We implement the run test using binary representations of protein sequences in which the amino acids of interest are assigned a value of 1 and all others a value of 0. We consider individual amino acids and sets of various combinations of them based upon hydrophobicity (4 sets), charge (3 sets), volume (4 sets), and secondary structure propensity (3 sets). We find that any sequence chosen randomly has a 90% or greater chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. We regard this as strong support for the random-origin hypothesis. However, we do observe significant deviations from the random expectation, as might be expected after billions of years of evolution. Two important global trends are found: (1) Amino acids with a strong α-helix propensity show a strong tendency to cluster, whereas those with β-sheet or reverse-turn propensity do not. (2) Clustered rather than evenly distributed patterns tend to be preferred by the individual amino acids, and this is particularly so for methionine. Finally, we consider the problem of reconciling the random nature of protein sequences with structurally meaningful periodic "patterns" that can be detected by sliding-window, autocorrelation, and Fourier analyses. Two examples, rhodopsin and bacteriorhodopsin, show that such patterns are a natural feature of random sequences.
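A run test on a binary-encoded sequence can be sketched as below. This uses the textbook normal approximation for the number of runs; the abstract does not give the exact form or sign convention of r_o, so the function name `runs_z` and the convention that fewer runs than expected (clustering) yields a negative score are assumptions.

```python
import math

def runs_z(sequence, targets):
    """Standardized run count for residues in `targets` (1) versus all others (0).

    Under random ordering the score is approximately N(0, 1). In this sketch a
    negative value means fewer runs than expected, i.e. clustering.
    """
    bits = [1 if aa in targets else 0 for aa in sequence]
    n1 = sum(bits)
    n0 = len(bits) - n1
    n = n1 + n0
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    mean = 2 * n1 * n0 / n + 1
    var = 2 * n1 * n0 * (2 * n1 * n0 - n) / (n * n * (n - 1)) if n > 1 else 0.0
    if var <= 0:
        raise ValueError("run test undefined for this composition")
    return (runs - mean) / math.sqrt(var)

print(runs_z("AAAAALLLLL", {"A"}))  # clustered: negative score
print(runs_z("ALALALALAL", {"A"}))  # alternating: positive score
```

Passing a set of residues as `targets` (for example, all hydrophobic residues) gives the grouped analyses the abstract describes; which residues belong to each group is left to the caller.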

14.
15.
The advent of full genome sequences provides exceptionally rich data sets to explore molecular and evolutionary mechanisms that shape divergence among and within genomes. In this study, we use multivariate analysis to determine the processes driving genome-wide patterns of amino acid usage in the obligate endosymbiont Buchnera and its close free-living relative Escherichia coli. In the AT-rich Buchnera genome, the primary source of variation in amino acid usage differentiates high- and low-expression genes. Amino acids of high-expression Buchnera genes are generally less aromatic and use relatively GC-rich codons, suggesting that selection against aromatic amino acids and against amino acids with AT-rich codons is stronger in high-expression genes. Selection to maintain hydrophobic amino acids in integral membrane proteins is a primary factor driving protein evolution in E. coli but is a secondary factor in Buchnera. In E. coli, gene expression is a secondary force driving amino acid usage, and a correlation with tRNA abundance suggests that translational selection contributes to this effect. Although this and previous studies demonstrate that AT mutational bias and genetic drift influence amino acid usage in Buchnera, this genome-wide analysis argues that selection is sufficient to affect the amino acid content of proteins with different expression and hydropathy levels.

16.
Structural database-derived propensities for amino acids to adopt particular local protein structures, such as alpha-helix and beta-strand, have long been recognized and effectively exploited for the prediction of protein secondary structure. However, the experimental verification of database-derived propensities using mutagenesis studies has been problematic, especially for beta-strand propensities, because local structural preferences are often confounded by non-local interactions arising from formation of the native tertiary structure. Thus, the overall thermodynamic stability of a protein is not always altered in a predictable manner by changes in local structural propensity at a single position. In this study, we have undertaken an investigation of the relationship between beta-strand propensity and protein folding kinetics. By characterizing the effects of a wide variety of amino acid substitutions at two different beta-strand positions in an SH3 domain, we have found that the observed changes in protein folding rates are very well correlated to beta-strand propensities for almost all of the substitutions examined. In contrast, there is little correlation between propensities and unfolding rates. These data indicate that beta-strand conformation is well formed in the structured portion of the SH3 domain transition state, and that local structure propensity strongly influences the stability of the transition state. Since the transition state is known to be packed more loosely than the native state and likely lacks many of the non-local stabilizing interactions seen in the native state, we suggest that folding kinetics studies may generally provide an effective means for the experimental validation of database-derived local structural propensities.

17.
Sense codons are found in specific contexts   Total citations: 27 (self-citations: 0; citations by others: 27)
The sequence environment of codons in structural genes has been investigated statistically, using computer methods. A set of Escherichia coli genes with abundant products was compared with a set having low gene product levels, in order to detect potential differences associated with expression. The results show striking non-randomness in the nucleotides occurring near codons. These effects are, unexpectedly, very much larger and more homogeneous among the genes with rare products. The intensity of effects in weakly expressed genes suggests that such non-random sequence environments decrease expression. In the weakly expressed set of genes, the 5' neighbor of a codon and all positions of the 3' neighbor codon are biased. In the highly expressed genes, the first nucleotide of the next codon is a uniquely affected site. The distribution of non-randomness in weakly expressed genes suggests that sequence bias is primarily due to a constraint acting directly on the secondary or tertiary structure of the codon/anticodon. In highly expressed genes, the observed bias suggests an interaction between the codon/anticodon and a site outside the codon/anticodon. Much of the tendency to non-random near-neighbor sequences in weakly expressed genes can be ascribed to a correlation between nearby nucleotides and the wobble nucleotide of the codon, despite the fact that selection of such correlations will alter the amino acid sequence. The favored pattern, in genes expressed at low level, is R YYR or Y RRY. R indicates purine, Y indicates pyrimidine; the space is the boundary between codons. It seems likely that this preference for nearby sequences is the physical basis of the genetic context effect. Under this assumption such sequence biases will affect expression. On this basis, we predict new sites for contextual mutations which decrease expression, and suggest a strategy for the design of messages having optimal translational activity.

18.
Kcv is a 94-amino acid protein encoded by chlorella virus PBCV-1 that corresponds to the pore module of K(+) channels. Therefore, Kcv can be a model for studying the protein design of K(+) channel pores. We analyzed the molecular diversity generated by approximately 1 billion years of evolution on kcv genes isolated from 40 additional chlorella viruses. Because the channel is apparently required for virus replication, the Kcv variants are all functional and contain multiple and dispersed substitutions that represent a repertoire of allowed sets of amino acid substitutions (from 4 to 12 amino acids). Correlations between amino acid substitutions and the new properties displayed by these channels guided site-directed mutations that revealed synergistic amino acid interactions within the protein as well as previously unknown interactions between distant channel domains. The effects of these multiple changes were not predictable from a priori structural knowledge of the channel pore.

19.
Goliaei B  Minuchehr Z 《FEBS letters》2003,537(1-3):121-127
Amino acids seem to have specific preferences for various locations in alpha-helices. These specific preferences, called singlet local propensity (SLP), have been determined by calculating the preference of occurrence of each amino acid in different positions of the alpha-helix. We have studied the occurrence of amino acids, single or pairs, in different positions, singlet or doublet, of alpha-helices in a database of 343 non-homologous proteins representing a unique superfamily from the SCOP database with a resolution better than 2.5 Å from the Protein Data Bank. The preference of single amino acids for various locations of the helix was shown by the relative entropy of each amino acid with respect to the background. Based on the total relative entropy of all amino acids occurring in a single position, the N(cap) position was found to be the most selective position in the alpha-helix. A rigorous statistical analysis of amino acid pair occurrences showed that there are exceptional pairs for which the observed frequency of occurrence in various doublet positions of the alpha-helix is significantly different from the expected frequency of occurrence in that position. The doublet local propensity (DLP) was defined as the preference of occurrences of amino acid pairs in different doublet positions of the alpha-helix. For most amino acid pairs, the observed DLP (DLP(O)) was nearly equal to the expected DLP (DLP(E)), which is the product of the related SLPs. However, for the exceptional pairs of amino acids identified above, the DLP(O) and DLP(E) values were significantly different. Based on the relative values of DLP(O) and DLP(E), exceptional amino acid pairs were divided into two categories. Those for which the DLP(O) values are higher than DLP(E) should have a strong tendency to pair together in the specified position. For those pairs for which the DLP(O) values are less than DLP(E), there exists a hindrance to the two amino acids neighboring each other in that specific position of the alpha-helix. These cases have been identified and listed in various tables in this paper. The amount of mutual information carried by the exceptional pairs of amino acids was significantly higher than the average mutual information carried by other amino acid pairs. The average mutual information conveyed by amino acid pairs in each doublet position was found to be very small but non-zero.
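The quantities named in this abstract can be illustrated with a short sketch. The abstract states only that DLP(E) is the product of the related SLPs; defining a propensity as the frequency at a position divided by the background frequency, measuring relative entropy in bits, and all function names and numbers below are assumptions made for illustration.

```python
import math

def propensity(count_at_position, total_at_position, background_freq):
    """Frequency at a helix position divided by background frequency (assumed definition)."""
    return (count_at_position / total_at_position) / background_freq

def relative_entropy(position_freqs, background_freqs):
    """Relative entropy, in bits, of a position's residue distribution against the background."""
    return sum(p * math.log2(p / background_freqs[aa])
               for aa, p in position_freqs.items() if p > 0)

def expected_dlp(slp_first, slp_second):
    """Expected doublet propensity as the product of the two singlet propensities."""
    return slp_first * slp_second

# Made-up numbers: a pair seen more often than its singlet propensities predict.
slp_a = propensity(30, 100, 0.10)   # about 3.0
slp_b = propensity(15, 100, 0.10)   # about 1.5
dlp_e = expected_dlp(slp_a, slp_b)
dlp_o = 9.0                          # hypothetical observed doublet propensity
print(dlp_o > dlp_e)                 # pair favoured beyond independent expectation
```

Comparing DLP(O) with DLP(E) in this way only flags candidate pairs; deciding which differences are significant requires the statistical analysis the abstract refers to.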

20.
A large set of three-dimensional structures of 264 protein-protein complexes with known nonsynonymous single nucleotide polymorphisms (nsSNPs) at the interface was built using homology-based methods. The nsSNPs were mapped on the proteins' structures and their effect on the binding energy was investigated with CHARMM force field and continuum electrostatic calculations. Two sets of nsSNPs were studied: disease annotated Online Mendelian Inheritance in Man (OMIM) and nonannotated (non-OMIM). It was demonstrated that OMIM nsSNPs tend to destabilize the electrostatic component of the binding energy, in contrast with the effect of non-OMIM nsSNPs. In addition, it was shown that the change of the binding energy upon amino acid substitutions is not related to the conservation of the net charge, hydrophobicity, or hydrogen bond network at the interface. The results indicate that, generally, the effect of nsSNPs on protein-protein interactions cannot be predicted from amino acids' physico-chemical properties alone, since in many cases a substitution of a particular residue with another amino acid having completely different polarity or hydrophobicity had little effect on the binding energy. Analysis of sequence conservation showed that nsSNPs at highly conserved positions resulted in a large variance of the binding energy changes. In contrast, amino acid substitutions corresponding to nsSNPs at nonconserved positions, on average, were not found to have a large effect on binding affinity. pKa calculations were performed and showed that amino acid substitutions could change the wild-type proton uptake/release, thus resulting in a different pH dependence of the binding energy.
