Similar articles
 20 similar articles found (search time: 15 ms)
1.
Several studies have highlighted the leading role of the sequence periodicity of polar and nonpolar amino acids (binary patterns) in the formation of regular secondary structures (RSS). However, these were based on the analysis of only a few simple cases, with no direct means of correlating binary patterns with the limits of RSS. Here, HCA‐derived hydrophobic clusters (HC), which are conditioned binary patterns whose positions fit well those of RSS, were considered. All the HC types, defined by unique binary patterns, that were commonly observed in three‐dimensional (3D) structures of globular domains were analyzed. The 180 HC types with preferences for either α‐helices or β‐strands distinctly contain basic binary units typical of these RSS. Therefore, a general trend supporting the “binary pattern preference” assumption was observed. HC for which observed RSS are in disagreement with their expected behavior (discordant HC) were also examined. They were separated into HC types with moderate preferences for RSS, having “weak” binary patterns and versatile RSS, and HC types with high preferences for RSS, having “strong” binary patterns and thus displaying nonpolar amino acids at the protein surface. It was shown that in both cases, discordant HC could be distinguished from concordant ones by well‐differentiated amino acid compositions. The obtained results could thus help to complement the currently available methods for the accurate prediction of secondary structures in proteins from the information in a single amino acid sequence alone. This can be especially useful for characterizing orphan sequences and for assisting protein engineering and design. Proteins 2016; 84:624–638. © 2016 Wiley Periodicals, Inc.
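
As a rough illustration of the binary patterns and hydrophobic clusters discussed in this abstract, the sketch below encodes a sequence as 1 (hydrophobic) and 0 (other) and splits it into clusters. The hydrophobic alphabet `VILFMYW`, the `min_gap` threshold of four residues, and the function names are assumptions made for this example; proline is not treated as a cluster breaker here, so this is a simplification rather than the procedure used in the study.

```python
import re

# Assumption: a seven-residue hydrophobic alphabet often associated with HCA.
HYDROPHOBIC = set("VILFMYW")


def binary_pattern(seq):
    """Encode a protein sequence as '1' (hydrophobic) or '0' (any other residue)."""
    return "".join("1" if aa in HYDROPHOBIC else "0" for aa in seq.upper())


def hydrophobic_clusters(seq, min_gap=4):
    """Return (start, pattern) pairs for hydrophobic clusters.

    A cluster starts and ends with '1' and contains no run of `min_gap`
    or more consecutive '0's. Simplification: proline is ignored.
    """
    bits = binary_pattern(seq)
    # A '1', followed by any number of (up to min_gap-1 zeros, then a '1').
    cluster_re = r"1(?:0{0,%d}1)*" % (min_gap - 1)
    return [(m.start(), m.group()) for m in re.finditer(cluster_re, bits)]


if __name__ == "__main__":
    for start, pattern in hydrophobic_clusters("AVLAAKLAAAAAGGIW"):
        print(start, pattern)
```

Each returned pattern is a candidate "HC type" in the sense of the abstract; counting how often a given pattern coincides with a helix or a strand would require structure annotations that are not part of this sketch.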

2.
The Automated Protein Structure Analysis (APSA) method, which describes the protein backbone as a smooth line in three‐dimensional space and characterizes it by curvature κ and torsion τ as a function of arc length s, was applied to 77 proteins to determine all secondary structural units via specific κ(s) and τ(s) patterns. A total of 533 α‐helices and 644 β‐strands were recognized by APSA, whereas DSSP gives 536 and 651 units, respectively. Kinks and distortions were quantified and the boundaries (entry and exit) of secondary structures were classified. Similarity between proteins can be easily quantified using APSA, as was demonstrated for the roll architecture of the proteins ubiquitin and spinach ferredoxin. A twenty‐by‐twenty comparison of all α domains showed that the curvature‐torsion patterns generated by APSA provide an accurate and meaningful similarity measurement for secondary, super secondary, and tertiary protein structure. APSA is shown to accurately reflect the conformation of the backbone, effectively reducing three‐dimensional structure information to two‐dimensional representations that are easy to interpret and understand. Proteins 2009. © 2008 Wiley‐Liss, Inc.

3.
An algorithm is described for automatically detecting hydrophobic cores in proteins of known structure. Three pieces of information are considered in order to achieve this goal. These are: secondary structure, side-chain accessibility, and side-chain-side-chain contacts. Residues are considered to contribute to a core when they occur in regular secondary structure and have buried side chains that form predominantly nonpolar contacts with one another. This paper describes the algorithm's application to families of proteins with conserved topologies but low sequence similarities. The aim of this investigation is to determine the efficacy of the algorithm as well as to study the extent to which similar cores are identified within a common topology.

4.
The sweetness-suppressing polypeptide gurmarin isolated from Gymnema sylvestre consists of 35 amino acid residues and contains three intramolecular disulfide bonds. Nuclear magnetic resonance analysis showed that the hydrophobic side chains of Tyr-13, Tyr-14, Trp-28, and Trp-29 in gurmarin are oriented outwardly. Together with the hydrophobic side chains of Leu-9, Ile-11, and Pro-12, they form a hydrophobic cluster, and therefore these hydrophobic groups are assumed to act as the site for interaction with the receptor protein. To examine the roles of these hydrophobic amino acids, they were replaced by Gly. The resulting [Gly13,14,28,29]gurmarin and [Gly9,11,13,14,28,29]gurmarin did not suppress the responses to sucrose, glucose, fructose, or Gly. This result strongly suggests that these hydrophobic amino acids are involved in the interaction with the receptor protein. © 1998 John Wiley & Sons, Inc. Biopoly 45: 231–238, 1998

5.
There are numerous examples of convergent evolution in nature. Major ecological adaptations such as flight, loss of limbs in vertebrates, pesticide resistance, adaptation to a parasitic way of life, etc., have all evolved more than once, as seen by their analogous functions in separate taxa. But what about protein evolution? Does the environment have a strong enough influence on the intracellular processes in which enzymes and other functional proteins take part for similar functional roles to evolve separately in different organisms? Manganese Superoxide Dismutase (MnSOD) is a manganese-dependent metallo-enzyme which plays a crucial role in protecting cells from oxidative stress by eliminating reactive oxygen species (superoxide). It is a ubiquitous housekeeping enzyme found in nearly all organisms. In this study we compare phylogenies based on MnSOD protein sequences to those based on scores from Hydrophobic Cluster Analysis (HCA). We calculated HCA similarity values for each pair of taxa to obtain a pair-wise distance matrix. A UPGMA tree based on the HCA distance matrix and a common tree based on the primary protein sequence for MnSOD were constructed. Differences between these two trees within animals, Enterobacteriaceae, planctomycetes and cyanobacteria are presented and cited as possible examples of convergence. We note that several residue changes result in changes in hydrophobicity at positions which apparently are under the effect of positive selection.
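
The tree-building step mentioned in this abstract (a UPGMA tree from a pairwise distance matrix) can be sketched as follows. This is a minimal, generic UPGMA that returns only the tree topology; the function name, the taxon labels, and the distance values are hypothetical, and how the HCA similarity scores are turned into distances is not specified in the abstract and is not modeled here.

```python
def upgma(names, dist):
    """Cluster taxa by UPGMA.

    names: list of taxon labels.
    dist:  symmetric matrix (list of lists) of pairwise distances.
    Returns a nested tuple describing the tree topology (no branch lengths).
    """
    n = len(names)
    clusters = {i: (names[i], 1) for i in range(n)}  # id -> (subtree, size)
    # Pairwise distances keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters
        (tree_a, size_a) = clusters.pop(a)
        (tree_b, size_b) = clusters.pop(b)
        # Distance from the merged cluster to every remaining cluster is the
        # size-weighted average of the two old distances.
        for c in clusters:
            d_ac = d.pop((min(a, c), max(a, c)))
            d_bc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        del d[(a, b)]
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


if __name__ == "__main__":
    # Hypothetical distances for four taxa.
    taxa = ["A", "B", "C", "D"]
    matrix = [
        [0, 2, 10, 10],
        [2, 0, 10, 10],
        [10, 10, 0, 4],
        [10, 10, 4, 0],
    ]
    print(upgma(taxa, matrix))
```

Running the same function on a sequence-based distance matrix and comparing the two topologies would mirror the comparison described in the abstract.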

6.
The correlation between the primary and secondary structures of proteins was analysed using a large data set from the Protein Data Bank. Clear preferences of amino acids towards certain secondary structures classify amino acids into four groups: α-helix preferrers, strand preferrers, turn and bend preferrers, and His and Cys (the latter two amino acids show no clear preference for any secondary structure). Amino acids in the same group have similar structural characteristics at their Cβ and Cγ atoms that predict their preference for a particular secondary structure. All α-helix preferrers have neither polar heteroatoms on the Cβ and Cγ atoms, nor a branching or aromatic group on the Cβ atom. All strand preferrers have aromatic groups or branching groups on the Cβ atom. All turn and bend preferrers have a polar heteroatom on the Cβ or Cγ atoms or do not have a Cβ atom at all. These new rules could be helpful in making predictions about non-natural amino acids.
Snežana D. Zarić

7.
We have used cluster analysis to identify recurring sequence patterns that transcend protein family boundaries. A subset of these patterns occurs predominantly in a single type of local structure in proteins. Here we characterize the three-dimensional structures and contexts in which these sequence patterns occur, with particular attention to the interactions responsible for their structural selectivity.

8.
Baussand J, Deremble C, Carbone A. Proteins 2007, 67(3): 695-708
Several studies on large and small families of proteins proved in a general manner that hydrophobic amino acids are globally conserved even if they are subjected to high substitution rates. Statistical analysis of amino acid evolution within blocks of hydrophobic amino acids detected in sequences suggests their usage as a basic structural pattern to align pairs of proteins of less than 25% sequence identity, with no need to know their 3D structure. The authors present a new global alignment method and an automatic tool for Proteins with HYdrophobic Blocks ALignment (PHYBAL) based on the combinatorics of overlapping hydrophobic blocks. Two substitution matrices modeling a different selective pressure inside and outside hydrophobic blocks are constructed, the Inside Hydrophobic Blocks Matrix and the Outside Hydrophobic Blocks Matrix, and a 4D space of gap values is explored. PHYBAL performance is evaluated against the Needleman and Wunsch algorithm run with Blosum 30, Blosum 45, Blosum 62, Gonnet, HSDM, PAM250, Johnson and Remote Homo matrices. PHYBAL behavior is analyzed on eight randomly selected pairs of proteins of >30% sequence identity that cover a large spectrum of structural properties. It is also validated on two large datasets, the 127 pairs of the Domingues dataset with >30% sequence identity, and 181 pairs issued from BAliBASE 2.0 and ranked by percentage of identity from 7 to 25%. Results confirm the importance of considering substitution matrices modeling hydrophobic contexts and a 4D space of gap values in aligning distantly related proteins. Two new notions of local and global stability are defined to assess the robustness of an alignment algorithm and the accuracy of PHYBAL. A new notion, the SAD-coefficient, to assess the difficulty of structural alignment is also introduced. PHYBAL has been compared with Hydrophobic Cluster Analysis and HMMSUM methods.

9.
Loops are regions of nonrepetitive conformation connecting regular secondary structures. We identified 2,024 loops of one to eight residues in length, with acceptable main-chain bond lengths and peptide bond angles, from a database of 223 protein and protein-domain structures. Each loop is characterized by its sequence, main-chain conformation, and relative disposition of its bounding secondary structures as described by the separation between the tips of their axes and the angle between them. Loops, grouped according to their length and type of their bounding secondary structures, were superposed and clustered into 161 conformational classes, corresponding to 63% of all loops. Of these, 109 (51% of the loops) were populated by at least four nonhomologous loops or four loops sharing a low sequence identity. Another 52 classes, including 12% of the loops, were populated by at least three loops of low sequence similarity from three or fewer nonhomologous groups. Loop class suprafamilies resulting from variations in the termini of secondary structures are discussed in this article. Most previously described loop conformations were found among the classes. New classes included a 2:4 type IV hairpin, a helix-capping loop, and a loop that mediates dinucleotide-binding. The relative disposition of bounding secondary structures varies among loop classes, with some classes such as beta-hairpins being very restrictive. For each class, sequence preferences as key residues were identified; those found more frequently at these conserved positions than in proteins overall were Gly, Asp, Pro, Phe, and Cys. Most of these residues are involved in stabilizing loop conformation, often through a positive phi conformation or secondary structure capping. Identification of helix-capping residues and beta-breakers among the highly conserved positions supported our decision to group loops according to their bounding secondary structures.
Several of the identified loop classes were associated with specific functions, and all of the member loops had the same function; key residues were conserved for this purpose, as is the case for the parvalbumin-like calcium-binding loops. A significant number, but not all, of the member loops of other loop classes had the same function, as is the case for the helix-turn-helix DNA-binding loops. This article provides a systematic and coherent conformational classification of loops, covering a broad range of lengths and all four combinations of bounding secondary structure types, and supplies a useful basis for modelling of loop conformations where the bounding secondary structures are known or reliably predicted.

10.
We present an analysis of intron positions in relation to nucleotides, amino acid residues, and protein secondary structure. Previous work has shown that intron sites in proteins are not randomly distributed with respect to secondary structures. Here we show that this preference can be almost totally explained by the nucleotide bias of splice site machinery, and may well not relate to protein stability or conformation at all. Each intron phase is preferentially associated with its own set of residues: phase 0 introns with lysine, glutamine, and glutamic acid before the intron, and valine after; phase 1 introns with glycine, alanine, valine, aspartic acid, and glutamic acid; and phase 2 introns with arginine, serine, lysine, and tryptophan. These preferences can be explained principally on the basis of nucleotide bias at intron locations, which is in accordance with previous literature. Although this work does not prove that introns are inserted into genomes at specific proto-splice sites, it shows that the nucleotide bias surrounding introns, however it originally occurred, explains the observed correlations between introns and protein secondary structure.

11.
The conformational parameters $P_k$ for each amino acid species ($j$ = 1–20) of sequential peptides in proteins are presented as the product of $P_{i,k}$, where $i$ is the number of the sequential residue in the $k$th conformational state ($k$ = α-helix, β-sheet, β-turn, or unordered structure). Since the average parameter for an $n$-residue segment is related to the average probability of finding the segment in the $k$th state, it becomes a geometric mean, $(P_k)_{\mathrm{av}} = \left(\prod_{i=1}^{n} P_{i,k}\right)^{1/n}$, with amino acid residue $i$ increasing from 1 to $n$. We then used $\ln (P_k)_{\mathrm{av}}$ to convert a multiplicative process to a summation, i.e., $\ln (P_k)_{\mathrm{av}} = (1/n)\sum_{i=1}^{n} \ln P_{i,k}$, for ease of operation. This is unlike the popular Chou-Fasman algorithm, which has the flaw of using the arithmetic mean for relative probabilities. The Chou-Fasman algorithm happens to be close to our calculations in many cases, mainly because the difference between their $P_k$ and our $\ln P_k$ is nearly constant for about one-half of the 20 amino acids. When stronger conformation formers and breakers exist, the difference becomes larger and the prediction at the N- and C-terminal α-helix or β-sheet could differ. If the average conformational parameters of the overlapping segments of any two states are too close for a unique solution, our calculations could lead to a different prediction.
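
The difference between the geometric mean described in this abstract and an arithmetic mean can be seen in a few lines. The sketch below computes both for one segment; the function names and the parameter values are hypothetical and chosen only to show how a single strong breaker pulls the geometric mean down more than the arithmetic one.

```python
import math


def geometric_mean_param(params):
    """Geometric mean via logs: exp((1/n) * sum(ln P_i)) over a segment."""
    return math.exp(sum(math.log(p) for p in params) / len(params))


def arithmetic_mean_param(params):
    """Plain arithmetic mean of the same per-residue parameters."""
    return sum(params) / len(params)


if __name__ == "__main__":
    # Hypothetical helix parameters: three strong formers and one strong breaker.
    segment = [1.4, 1.4, 1.4, 0.5]
    print("geometric :", round(geometric_mean_param(segment), 3))
    print("arithmetic:", round(arithmetic_mean_param(segment), 3))
```

When all parameters in a segment are close to one another, the two means nearly coincide, which is consistent with the abstract's remark that the two approaches often agree.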

12.
Standard secondary structure elements, such as α-helices or β-sheets, are characterized by repeating backbone torsion angles (φ,ψ) at the single residue level. Two-residue motifs of the type (φ,ψ)₂ are also observed in nonlinear conformations, mainly turns. Taking these observations a step further, it can be argued that there is no a priori reason why higher order periodicities cannot be envisioned in protein structures, such as, for example, periodic transitions between successive residues of the type (…-α-β-α-β-α-…), or (…-β-αL-β-αL-β-…), or (…-α-β-αL-α-β-αL-…), and so forth, where the symbols (α,β,αL) refer to the established Ramachandran-based residue conformations. From all such possible higher order periodicities, here we examine the protein structures deposited with the PDB for the presence of short-range periodical conformations comprising five consecutive residues alternating between two (and only two) distinct Ramachandran regions, for example, conformations of the type (α-β-α-β-α) or (β-αL-β-αL-β), and so forth. Using a probabilistic approach, we have located several thousand such pentapeptides, and these were clustered and analyzed in terms of their structural characteristics, their sequences, and their putative functional correlations using a gene ontology-based approach. We show that such nonstandard short-range periodicities are present in a large and functionally diverse sample of proteins, and can be grouped into two structurally conserved major types. Examination of the structural context in which these pentapeptides are observed gave no conclusive evidence for the presence of a persistent structural or functional role of these higher order periodic conformations.
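
The five-residue alternation described in this abstract can be illustrated with a simple window scan. The sketch assumes each residue has already been assigned a one-letter Ramachandran region label ('A' for α, 'B' for β, 'L' for αL); that labeling, the function name, and the example string are assumptions, and the plain pattern match below stands in for, and is not, the probabilistic approach used by the authors.

```python
def alternating_pentapeptides(regions):
    """Find 5-residue windows of the form X-Y-X-Y-X with X != Y.

    regions: string or list of per-residue region labels,
             e.g. 'A' (alpha), 'B' (beta), 'L' (left-handed alpha).
    Returns a list of (start_index, window) pairs.
    """
    hits = []
    for i in range(len(regions) - 4):
        w = regions[i:i + 5]
        # Positions 0, 2, 4 share one region; positions 1, 3 share another.
        if w[0] != w[1] and w[0] == w[2] == w[4] and w[1] == w[3]:
            hits.append((i, "".join(w)))
    return hits


if __name__ == "__main__":
    # Hypothetical region string for a 12-residue stretch.
    for start, window in alternating_pentapeptides("AABABABBLBLB"):
        print(start, window)
```

Overlapping hits (such as two windows shifted by one residue) are reported separately here; merging them into longer periodic stretches would be a natural extension.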

13.
Prediction of secondary structures in nucleic acids requires both an adequate physical model and powerful calculation algorithms. In our approach, we cut the molecules into sections whose contributions to the global energy are context-dependent but roughly additive. The structure of minimum energy is obtained by a tree search under constraints of binary incompatibilities. Our algorithm of the "incompatibility islets" is shown to be more powerful than the "bit parallel forward checking" algorithm, well known in Artificial Intelligence. Recurrent algorithms proposed by other authors are even more rapid, but often miss the correct structures, for they demand a strict additivity of the energetic contributions, which is physically unjustified. New strategies, required to deal with molecules of more than 200 nucleotides, are discussed. Our physical model has been improved by considering the special case of internal loops beginning with a G-A opposition. A bonus of 1.5 kcal is attributed to such a feature, at each side of an internal loop. To illustrate our programs, we give the computed schemes for the 3' termini of the small subunit ribosomal RNA.

14.
Fuzzy cluster analysis has been applied to the 20 amino acids by using 65 physicochemical properties as a basis for classification. The clustering products, the fuzzy sets (i.e., classical sets with associated membership functions), have provided a new measure of amino acid similarities for use in protein folding studies. This work demonstrates that fuzzy sets of simple molecular attributes, when assigned to amino acid residues in a protein's sequence, can predict the secondary structure of the sequence with reasonable accuracy. An approach is presented for discriminating standard folding states, using near-optimum information splitting in half-overlapping segments of the sequence of assigned membership functions. The method is applied to a nonredundant set of 252 proteins and yields approximately 73% matching for correctly predicted and correctly rejected residues with approximately 60% overall success rate for the correctly recognized ones in three folding states: alpha-helix, beta-strand, and coil. The most useful attributes for discriminating these states appear to be related to size, polarity, and thermodynamic factors. Van der Waals volume, apparent average thickness of surrounding molecular free volume, and a measure of dimensionless surface electron density can explain approximately 95% of prediction results. Hydrogen bonding and hydrophobicity indices do not yet enable clear clustering and prediction.

15.
Silva PJ. Proteins 2008, 70(4): 1588-1594
Hydrophobic cluster analysis (HCA) has long been used as a tool to detect distant homologies between protein sequences, and to classify them into different folds. However, it relies on expert human intervention, and is sensitive to subjective interpretations of pattern similarities. In this study, we describe a novel algorithm to assess the similarity of hydrophobic amino acid distributions between two sequences. Our algorithm correctly identifies as misattributions several HCA-based proposals of structural similarity between unrelated proteins present in the literature. We have also used this method to identify the proper fold of a large variety of sequences, and to automatically select the most appropriate structure for homology modeling of several proteins with low sequence identity to any other member of the protein data bank. Automatic modeling of the target proteins based on these templates yielded structures with TM-scores (vs. experimental structures) above 0.60, even without further refinement. Besides enabling a reliable identification of the correct fold of an unknown sequence and the choice of suitable templates, our algorithm also shows that whereas most structural classes of proteins are very homogeneous in hydrophobic cluster composition, a tenth of the described families are compatible with a large variety of hydrophobic patterns. We have built a browsable database of every major representative hydrophobic cluster pattern present in each structural class of proteins, freely available at http://www2.ufp.pt/ pedros/HCA_db/index.htm.

16.
17.
Analysis of the far-ultraviolet circular dichroism spectrum of bovine blood coagulation factor IX reveals the presence of approximately 14% helical structures, 26% β-sheets, 20% β-turns, and 40% coils. These values are essentially the same for the activation products of this zymogen, factor IXaα and factor IXaβ. Similar analysis for bovine factor X permits calculation of these secondary structural elements as approximately 11% helices, 31% β-structures, 22% β-turns, and 36% random structures. Bovine prothrombin contains approximately 12% helical structures, 35% β-structures, 24% β-turns, and 29% coils. None of these values is substantially altered as a result of an increase of the pH from 7.4 to 10.5, or upon addition of Ca2+ to a concentration of at least 20 mM. Analysis of the near-ultraviolet spectra of factor IX and prothrombin suggests that several aromatic amino acid residues and the disulfide bond present in their γ-carboxyglutamic acid-containing regions are exposed to solvent and are perturbed by the above pH adjustment and Ca2+ addition. Similar effects are observed in the case of factor X; in addition, the Trp residue at the amino terminus of the heavy chain appears to be influenced by the above pH alteration. The results reported in this paper show that these vitamin K-dependent blood coagulation proteins are similar in their ordered secondary structures, which are dominated by β-sheets and β-turns. Their overall secondary structures are not influenced by Ca2+ binding and are stable to alkaline pH changes. However, these same environmental alterations appear to be effective probes of aromatic residues in the γ-carboxyglutamic acid regions.

18.
We report a novel computational procedure for determining protein native topology, or fold, by defining loop connectivity based on skeletons of secondary structures that can usually be obtained from low- to intermediate-resolution density maps. The procedure primarily involves a knowledge-based geometry filter followed by an energetics-based evaluation. It was tested on a large set of skeletons covering a wide range of protein architecture, including one modeled from an experimentally determined 7.6 Å cryo-electron microscopy (cryo-EM) density map. The results showed that the new procedure could effectively deduce protein folds without high-resolution structural data, a feature that could also be used to recognize native fold in structure prediction and to interpret data in fields like structural genomics. Most importantly, in the energetics-based evaluation, it was revealed that, despite the inevitable errors in the artificially constructed structures and limited accuracy of knowledge-based potential functions, the average energy of an ensemble of structures with slightly different configurations around the native skeleton is a much more robust parameter for marking native topology than the energy of individual structures in the ensemble. This result implies that, among all the possible topology candidates for a given skeleton, evolution has selected the native topology as the one that can accommodate the largest structural variations, not the one rigidly trapped in a deep, but narrow, conformational energy well.
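
The abstract's central observation, that the ensemble-average energy marks the native topology more robustly than the energy of any single structure, can be illustrated with a toy ranking. The function names, topology labels, and energy values below are hypothetical; the knowledge-based potential and the way perturbed structures are generated are not modeled.

```python
from statistics import mean


def rank_by_ensemble_average(ensembles):
    """Rank topologies by the mean energy of their perturbed structures.

    ensembles: dict mapping topology name -> list of energies.
    Returns topology names, lowest average energy first.
    """
    return sorted(ensembles, key=lambda name: mean(ensembles[name]))


def best_by_single_structure(ensembles):
    """Pick the topology that contains the single lowest-energy structure."""
    return min(ensembles, key=lambda name: min(ensembles[name]))


if __name__ == "__main__":
    # Hypothetical energies: a broad, shallow well versus a deep, narrow one.
    energies = {
        "broad_well": [-10.0, -9.0, -11.0, -10.0],
        "narrow_well": [-14.0, -2.0, -3.0, -1.0],
    }
    print(rank_by_ensemble_average(energies))
    print(best_by_single_structure(energies))
```

In this toy case the two criteria disagree: the narrow well holds the lowest single energy, while the broad well has the lower average, which is the kind of situation the abstract describes.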

19.
A series of Ala vs. Gly mutations at different helical and nonhelical positions of the chemotactic protein CheY, from E. coli, has been made. We have used this information to fit a general analytical equation that describes the free energy changes of an Ala to Gly mutation within ±0.45 kcal mol⁻¹ with 95% confidence. The equation includes three terms: (1) the change in solvent-accessible hydrophobic surface area, corrected for the possible closure of the cavity left by deleting the Cβ of the Ala; (2) the change in hydrophilic area of the nonintramolecularly hydrogen-bonded groups; and (3) the dihedral angles of the position being mutated. This last term extends the calculation to any conformation, not only α-helices. The general applicability of the equation for Ala vs. Gly mutations, when Ala or a small solvent-exposed polar residue is the wild-type residue, has been tested using data from other proteins: barnase, CI2 trypsin inhibitor, T4 lysozyme, and Staphylococcus nuclease. The predictive power of this simple approach offers the possibility of extending it to more complex mutations. © 1995 Wiley-Liss, Inc.

20.
Background: The second internal transcribed spacer (ITS2) has proven to contain useful biological information at higher taxonomic levels. Objectives: This study was carried out to unravel the biological information in the ITS2 region of An. culicifacies and the internal relationships between the five species of Anopheles culicifacies. Methodology: To achieve these objectives, twenty-two ITS2 sequences (~370 bp) of An. culicifacies species were retrieved from GenBank and secondary structures were generated. The generated secondary structures were used to refine the primary structures, i.e. the nucleotide sequences of ITS2. The improved ITS2 primary sequences were then aligned and used for the construction of phylogenetic trees. Results and discussion: ITS2 secondary structures of An. culicifacies closely resembled the near-universal eukaryotic ITS2 secondary structure and had three helices, and the structures of helix II and the distal region of helix III of ITS2 of An. culicifacies were strikingly similar to those regions of other organisms, strengthening the possible involvement of these regions in rRNA biogenesis. Phylogenetic analysis of improved ITS2 sequences revealed two main clades, one comprising sibling species B, C and E and the other comprising A and D. Conclusions: The near sequence identity of the ITS2 regions of the members of a particular clade indicates that this region is undergoing parallel evolution to perform clade-specific RNA biogenesis. The divergence of certain isolates of An. culicifacies from the main clades in phylogenetic analyses suggests the possible existence of camouflaged sub-species within the An. culicifacies complex. Using the fixed nucleotide differences, we estimate that these two clades diverged nearly 3.3 million years ago, while the sibling species in clade 2 are under less evolutionary pressure and may have evolved much later than the members of clade 1.
