Similar articles
20 similar articles found
1.
A model for prediction of alpha-helical regions in amino acid sequences has been tested on the mainly-alpha protein structure class. The modeling represents the construction of a continuous hypothetical alpha-helical conformation for the whole protein chain, and was performed using molecular mechanics tools. The positive prediction rate for alpha-helical and non-alpha-helical pentapeptide fragments of the proteins is 79%. The model considers only local interactions in the polypeptide chain without the influence of the tertiary structure. It was shown that the local interaction defines the alpha-helical conformation for 85% of the native alpha-helical regions. The relative energy contributions to the energy of the model were analyzed with the finding that the van der Waals component determines the formation of alpha-helices. Hydrogen bonds remain at constant energy regardless of whether an alpha-helix or a non-alpha-helix occurs in the native protein, and do not determine the location of helical regions. In contrast to existing methods, this approach additionally permits the prediction of conformations of side chains. The model suggests the correct values for ~60% of all chi-angles of alpha-helical residues.

2.
The prediction of the secondary structure of proteins from their amino acid sequences remains a key component of many approaches to the protein folding problem. The most abundant form of regular secondary structure in proteins is the alpha-helix, in which specific residue preferences exist at the N-terminal locations. Propensities derived from these observed amino acid frequencies in the Protein Data Bank (PDB) database correlate well with experimental free energies measured for residues at different N-terminal positions in alanine-based peptides. We report a novel method to exploit this data to improve protein secondary structure prediction through identification of the correct N-terminal sequences in alpha-helices, based on existing popular methods for secondary structure prediction. With this algorithm, the number of correctly predicted alpha-helix start positions was improved from 30% to 38%, while the overall prediction accuracy (Q3) remained the same, using cross-validated testing. Although the algorithm was developed and tested on multiple sequence alignment-based secondary structure predictions, it was also able to improve the predictions of start locations by methods that use single sequences to make their predictions. Furthermore, the residue frequencies at N-terminal positions of the improved predictions better reflect those seen at the N-terminal positions of alpha-helices in proteins. This has implications for areas such as comparative modeling, where a more accurate prediction of the N-terminal regions of alpha-helices should benefit attempts to model adjacent loop regions. The algorithm is available as a Web tool, located at http://rocky.bms.umist.ac.uk/elephant.  相似文献   

3.
Secondary structure prediction from amino acid sequence is a key component of protein structure prediction, with current accuracy at approximately 75%. We analysed two state-of-the-art secondary structure prediction methods, PHD and JPRED, comparing predictions with secondary structure assigned by the algorithms DSSP and STRIDE. The specific focus of our study was alpha-helix N-termini, as empirical free energy scales are available for residue preferences at N-terminal positions. Although these prediction methods perform well in general at predicting the alpha-helical locations and length distributions in proteins, they perform less well at predicting the correct helical termini. For example, although most predicted alpha-helices overlap a real alpha-helix (with relatively few completely missed or extra predicted helices), only one-third of JPRED and PHD predictions correctly identify the N-terminus. Analysis of neighbouring N-terminal sequences to predicted helical N-termini shows that the correct N-terminus is often within one or two residues. More importantly, the true N-terminal motif is, on average, more favourable as judged by our experimentally measured free energies. This suggests a simple, but powerful, strategy to improve secondary structure prediction using empirically derived energies to adjust the predicted output to a more favourable N-terminal sequence.  相似文献   

4.
Amphiphilic alpha-helices play a major role in membrane-dependent processes and are manifested in the primary structure of a protein by the periodic appearance of hydrophobic residues. Based on these periodic sequences, the hydrophobic moment was introduced, which essentially treats the hydrophobicity of amino acid residues as a two-dimensional vector sum and provides a measure of amphiphilicity within regular repeat structures. To identify putative amphiphilic alpha-helix forming sequences, hydrophobic moment analysis assumes an amino acid residue periodicity of 100° and scans protein primary structures to find the 11-residue window with the maximal hydrophobic moment. Taken with the window's mean hydrophobicity, hydrophobic moment plot analysis uses the coordinate pair [hydrophobic moment, mean hydrophobicity] to classify alpha-helices as either surface active, globular or transmembrane. More recently, this latter analysis has been extended to recognize candidate obliquely orientated alpha-helices. Here, the hydrophobic moment is reviewed and data to query the logic of using a fixed window length and a fixed residue angular periodicity in hydrophobic moment analysis are provided. In addition, problems associated with the use of such analysis to predict alpha-helix structure/function relationships are considered.
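The window scan described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the reviewed authors' implementation: the function names, the normalization by window length, and the toy hydrophobicity scale used in the usage note are all assumptions made here. Only the 100° periodicity and the 11-residue window come from the abstract.

```python
import math


def hydrophobic_moment(values, angle_deg=100.0):
    """Length-normalized magnitude of the 2-D vector sum of hydrophobicities.

    Residue i is placed at angle i * angle_deg around the helix axis.
    Dividing by the window length is an assumption of this sketch.
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(h * math.sin(i * delta) for i, h in enumerate(values))
    cos_sum = sum(h * math.cos(i * delta) for i, h in enumerate(values))
    return math.hypot(sin_sum, cos_sum) / len(values)


def best_window(sequence, scale, window=11, angle_deg=100.0):
    """Return (start, moment, mean_hydrophobicity) of the window with the
    largest hydrophobic moment, or None if the sequence is too short.

    `scale` maps one-letter residue codes to hydrophobicity values and must
    be supplied by the caller; no particular published scale is assumed.
    """
    best = None
    for start in range(len(sequence) - window + 1):
        values = [scale[aa] for aa in sequence[start:start + window]]
        moment = hydrophobic_moment(values, angle_deg)
        mean_h = sum(values) / window
        if best is None or moment > best[1]:
            best = (start, moment, mean_h)
    return best
```

For example, with a hypothetical two-letter scale `{"L": 1.0, "K": -1.0}`, `best_window("LLLLKLK", scale, window=4, angle_deg=180.0)` picks the alternating window starting at index 3. The returned moment and mean hydrophobicity form the coordinate pair used in hydrophobic moment plot analysis.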

5.
Detailed primary sequence and secondary structure analyses are reported for the hyaluronate binding region (G1 domain) and link protein of proteoglycan aggregates. These are based on six full or partial sequences from the chicken, pig, human, rat and bovine proteins. Determinations of a full pig and a partial human link protein sequence are reported in the Appendix. Five sequences at the N terminus in both proteins were compared with the structures of 11 variable immunoglobulin (Ig) fold domains for which crystal structures are available. Despite only modest sequence homology, a clear alignment could be proposed. Analysis of this shows that the equivalents of the first and second hypervariable segments are now significantly longer, and both proteins have N-terminal extensions that are up to 23 residues in length. Secondary structure predictions showed that these sequences could be identified with available crystal structures for the variable Ig fold. However, the hydrophobic residues involved in interactions between the light and heavy chains in Igs are replaced by hydrophilic charged groups in both proteins. These results imply that both proteins are members of the Ig superfamily, but exhibit structural differences distinct from other members of this superfamily for which crystal structures are known. The proteoglycan tandem repeat (PTR) is a repeat of 99 residues that is found twice in the amino acid sequence of link protein and the proteoglycan G1 domain adjacent to the Ig fold, and also twice in the proteoglycan G2 domain. A total of 16 PTRs was available for analysis. Compositional analyses show that these are positively charged if they originate from link protein, and negatively charged if from the G1 or G2 domains. The 16 Robson secondary structure predictions for the PTRs were averaged to improve the statistics of the prediction, and checked by comparison with Chou-Fasman calculations. A strong alpha-helix prediction was found at residues 13 to 25, and several beta-strands were predicted. The overall content is 18% alpha-helix and 28% beta-sheet, with 44% of the remaining sequence being predicted as turns. These analyses show that both the proteoglycan G1 domain and link protein are constructed from two distinct globular components, which may provide the two functional roles of these proteins in proteoglycan aggregation.

6.
We present a new method for predicting the secondary structure of globular proteins based on non-linear neural network models. Network models learn from existing protein structures how to predict the secondary structure of local sequences of amino acids. The average success rate of our method on a testing set of proteins non-homologous with the corresponding training set was 64.3% on three types of secondary structure (alpha-helix, beta-sheet, and coil), with correlation coefficients of C_alpha = 0.41, C_beta = 0.31 and C_coil = 0.41. These quality indices are all higher than those of previous methods. The prediction accuracy for the first 25 residues of the N-terminal sequence was significantly better. We conclude from computational experiments on real and artificial structures that no method based solely on local information in the protein sequence is likely to produce significantly better results for non-homologous proteins. The performance of our method on homologous proteins is much better than for non-homologous proteins, but is not as good as simply assuming that homologous sequences have identical structures.
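The two quality indices quoted in this abstract, the three-state success rate and the per-class correlation coefficients, can be illustrated as follows. This is a sketch under assumptions: the abstract does not define the correlation coefficient, so a Matthews-style coefficient is assumed here, and the function names and one-letter state codes (`H`, `E`, `C`) are chosen for illustration only.

```python
import math


def q3(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


def class_correlation(predicted, observed, state):
    """Matthews-style correlation for one state treated as a binary class.

    Assumption of this sketch: the coefficient is the standard Matthews
    formula; 0.0 is returned when the denominator vanishes.
    """
    tp = fp = tn = fn = 0
    for p, o in zip(predicted, observed):
        if p == state and o == state:
            tp += 1
        elif p == state:
            fp += 1
        elif o == state:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

As a usage example, `q3("HHEEC", "HHECC")` gives 0.8, since four of the five residues agree. A single overall percentage can hide poor performance on a rare class, which is presumably why per-class coefficients are reported alongside it.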

7.
The prediction of protein secondary structure (alpha-helices, beta-sheets and coil) is improved by 9% to 66% using the information available from a family of homologous sequences. The approach is based both on averaging the Garnier et al. (1978) secondary structure propensities for aligned residues and on the observation that insertions and high sequence variability tend to occur in loop regions between secondary structures. Accordingly, an algorithm first aligns a family of sequences and a value for the extent of sequence conservation at each position is obtained. This value modifies a Garnier et al. prediction on the averaged sequence to yield the improved prediction. In addition, from the sequence conservation and the predicted secondary structure, many active site regions of enzymes can be located (26 out of 43) with limited over-prediction (8 extra). The entire algorithm is fully automatic and is applicable to all structural classes of globular proteins.  相似文献   

8.
L-shaped structures formed by two consecutive alpha-helices joined by short connections are considered. The L-structures with the alpha m gamma beta/delta alpha n-conformations are of particular value since they usually have Pro residues in the second positions of the second alpha-helices. These structures can be divided into two classes, the right-turned and left-turned L-structures, depending on whether the second alpha-helix is located on the right or the left, relative to the first one when viewed from the hydrophobic core. Stereochemical analysis shows that in an ideal case the left- and right-turned L-structures should have different sequence patterns of hydrophobic, hydrophilic and proline residues. These sequence patterns can be used in the prediction of the L-shaped structures as well as in protein design and engineering.  相似文献   

9.
A refined prediction of the nicotinic acetylcholine receptor (nAChR) subunits' secondary structure was computed with third-generation algorithms. The four selected programs, PHD, Predator, DSC, and NNSSP, based on different prediction approaches, were applied to each sequence of an alignment of nAChR and 5-HT3 receptor subunits, as well as a larger alignment with related subunit sequences from glycine and GABA receptors. A consensus prediction was computed for the nAChR subunits through a "winner takes all" method. By integrating the probabilities obtained with PHD, DSC, and NNSSP, this prediction was filtered in order to eliminate the singletons and to more precisely establish the structure limits (only 4% of the residues were modified). The final consensus secondary structure includes nine alpha-helices (24.2% of the residues, with an average length of 13.9 residues) and 17 beta-strands (22.5% of the residues, with an average length of 6.6 residues). The large extracellular domain is predicted to be mainly composed of beta-strands, with only two helices at the amino-terminal end. The transmembrane segments are predicted to be in a mixed alpha/beta topology (with a predominance of alpha-helices), with no known equivalent in the current protein database. The cytoplasmic domain is predicted to consist of two well-conserved amphipathic helices joined together by an unfolded stretch of variable length and sequence. In general, the segments predicted to occur in a periodic structure correspond to the more conserved regions, as defined by an analysis of sequence conservation per position performed on 152 superfamily members. The solvent accessibility of each residue was predicted from the multiple alignments with PHDacc. Each segment with more than three exposed residues was assumed to be external to the core protein. Overall, these data constitute an envelope of structural constraints. In a subsequent step, experimental data relative to the extracellular portion of the complete receptor were incorporated into the model. This led to a proposed two-dimensional representation of the secondary structure in which the peptide chain of the extracellular domain winds alternately between the two interfaces of the subunit. Although this representation is not a tertiary structure and does not lead to predictions of specific beta-beta interaction, it should provide a basic framework for further mutagenesis investigations and for fold recognition (threading) searches.
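A "winner takes all" consensus followed by singleton removal, as mentioned in this abstract, might look like the sketch below. It is a simplified stand-in: the authors filtered using the probabilities from PHD, DSC, and NNSSP, whereas this version uses only the per-residue state strings. The function names, the tie-breaking rule, and the choice of coil (`C`) as the fallback state are assumptions of this illustration.

```python
from collections import Counter


def consensus(predictions, tie_state="C"):
    """Per-position majority vote over equal-length state strings.

    Positions where the top two states tie are assigned `tie_state`
    (an assumption; the abstract does not say how ties were resolved).
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("need at least one prediction, all of equal length")
    result = []
    for column in zip(*predictions):
        counts = Counter(column).most_common()
        top_state, top_count = counts[0]
        tied = len(counts) > 1 and counts[1][1] == top_count
        result.append(tie_state if tied else top_state)
    return "".join(result)


def remove_singletons(states, fill="C"):
    """Replace isolated single-residue helix or strand calls with `fill`."""
    out = list(states)
    for i, state in enumerate(states):
        prev_state = states[i - 1] if i > 0 else None
        next_state = states[i + 1] if i + 1 < len(states) else None
        if state != fill and state != prev_state and state != next_state:
            out[i] = fill
    return "".join(out)
```

With four hypothetical predictor outputs, `consensus(["HHEC", "HHCC", "HECC", "CHEC"])` yields `"HHCC"`: the third position is a two-way tie and falls back to coil.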

10.
The profile method, for detecting distantly related proteins by sequence comparison, has been extended to incorporate secondary structure information from known X-ray structures. The sequence of a known structure is aligned to sequences of other members of a given folding class. From the known structure, the secondary structure (alpha-helix, beta-strand or "other") is assigned to each position of the aligned sequences. As in the standard profile method, a position-dependent scoring table, termed a profile, is calculated from the aligned sequences. However, rather than using the standard Dayhoff mutation table in calculating the profile, we use distinct amino acid mutation tables for residues in alpha-helices, beta-strands or other secondary structures to calculate the profile. In addition, we also distinguish between internal and external residues. With this new secondary structure-based profile method, we created a profile for eight-stranded, antiparallel beta barrels of the insecticyanin folding class. It is based on the sequences of retinol-binding protein, insecticyanin and beta-lactoglobulin. Scanning the sequence database with this profile, it was possible to detect the sequence of avidin. The structure of streptavidin is known, and it appears to be distantly related to the antiparallel beta barrels. Also detected is the sequence of complement component C8, which we therefore predict to be a member of this folding class.  相似文献   

11.
Conformational properties of a peptide model for unfolded alpha-helices
Models of protein folding often hypothesize that the first step is local secondary structure formation. The assumption is that unfolded polypeptide chains possess an intrinsic propensity to form these local secondary structures. On the basis of this idea, it is tempting to model the local conformational properties of unfolded proteins using well-established residue secondary structure propensities, in particular, alpha-helix forming propensities. We have used spectroscopic methods to investigate the conformational behavior of a host-guest series of peptides designed to model unfolded alpha-helices. A suitable peptide model for unfolded alpha-helices was determined from studies of the length dependence of the conformational properties of alanine-based peptides. The chosen host peptide possessed a small, detectable, alpha-helix content. Substituting various representative guest residues into the central position of the host peptide at times changed the conformational behavior dramatically, and often in ways that could not be predicted from known alpha-helix forming propensities. The data presented can be used to rationalize some of these propensities. However, it is clear that secondary structure propensities cannot be used to predict the local conformational properties of unfolded proteins.  相似文献   

12.
Dynein is a motor ATPase, and the C-terminal two-thirds of its heavy chain form a ring structure. One of the protrusions from this ring structure is a stalk whose tip, the dynein stalk head (DSH), is thought to be the microtubule-binding domain. As a first step toward elucidating the functional mechanisms of DSH, we undertook NMR structural analysis of an isolated DSH from mouse cytoplasmic dynein. The DSH, expressed in bacteria and purified, coprecipitated with microtubules, suggesting that it was properly folded. Chemical shifts of the DSH were obtained from NMR measurements, and backbone assignment identified 94% of the main-chain N-H signals. Secondary structure prediction programs showed that about 60% of the residues formed alpha-helices. A region with cationic residues K58 and R61 (and possibly R66 as well), and another with R86, K88, K90, and K91, were found to form alpha-helices. Both of these regions may be important in forming the site on DSH that binds to a microtubule, which has a low pI with a number of acidic residues. Two synthetic peptides containing the sequence of alpha-helix 12 of beta-tubulin, considered to be important in binding to DSH, were investigated. Of these two peptides, the one with the higher helix-formation propensity appeared to bind to DSH, since it precipitated with DSH in a nearly stoichiometric manner. This suggested that the alpha-helicity of this region would be important in its binding to DSH.

13.
A few highly charged natural peptide sequences were recently suggested to form stable alpha-helical structures in water. In this article we show that these sequences represent a novel structural motif called "charged single alpha-helix" (CSAH). To obtain reliable candidate CSAH motifs, we developed two conceptually different computational methods capable of scanning large databases: SCAN4CSAH is based on sequence features characteristic for salt bridge stabilized single alpha-helices, whereas FT_CHARGE applies Fourier transformation to charges along sequences. Using the consensus of the two approaches, a remarkable number of proteins were found to contain putative CSAH domains. Recombinant fragments (50-60 residues) corresponding to selected hits obtained by both methods (myosin 6, Golgi resident protein GCP60, and M4K4 protein kinase) were produced and shown by circular dichroism spectroscopy to adopt largely alpha-helical structure in water. CSAH segments differ substantially both from coiled-coil and intrinsically disordered proteins, despite the fact that current prediction methods recognize them as either or both. Analysis of the proteins containing CSAH motif revealed possible functional roles of the corresponding segments. The suggested main functional features include the formation of relatively rigid spacer/connector segments between functional domains as in caldesmon, extension of the lever arm in myosin motors and mediation of transient interactions by promoting dimerization in a range of proteins.  相似文献   

14.
Many protein regions have been shown to be intrinsically disordered, lacking unique structure under physiological conditions. These intrinsically disordered regions are not only very common in proteomes, but also crucial to the function of many proteins, especially those involved in signaling, recognition, and regulation. The goal of this work was to identify the prevalence, characteristics, and functions of conserved disordered regions within protein domains and families. A database was created to store the amino acid sequences of nearly one million proteins and their domain matches from the InterPro database, a resource integrating eight different protein family and domain databases. Disorder prediction was performed on these protein sequences. Regions of sequence corresponding to domains were aligned using a multiple sequence alignment tool. From this initial information, regions of conserved predicted disorder were found within the domains. The methodology for this search consisted of finding regions of consecutive positions in the multiple sequence alignments in which 90% or more of the sequences were predicted to be disordered. This procedure was constrained to find such regions of conserved disorder prediction that were at least 20 amino acids in length. The results of this work included 3,653 regions of conserved disorder prediction, found within 2,898 distinct InterPro entries. Most regions of conserved predicted disorder detected were short, with less than 10% of those found exceeding 30 residues in length.
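The search procedure in this abstract, runs of consecutive alignment columns where at least 90% of sequences are predicted disordered and the run is at least 20 positions long, can be sketched as below. The input encoding (`D` for predicted disorder, any other character for order or gap), the function name, and the choice to count gaps as not disordered are assumptions of this sketch rather than details given in the abstract.

```python
def conserved_disorder_regions(aligned_flags, min_fraction=0.9, min_length=20):
    """Find half-open column ranges (start, end) of conserved predicted disorder.

    `aligned_flags` is a list of equal-length strings, one per aligned
    sequence, with "D" marking a residue predicted to be disordered.
    Gap characters count as not disordered (an assumption of this sketch).
    """
    if not aligned_flags:
        return []
    n_seqs = len(aligned_flags)
    n_cols = len(aligned_flags[0])
    regions = []
    start = None
    # Iterate one column past the end so a run touching the last column closes.
    for col in range(n_cols + 1):
        conserved = False
        if col < n_cols:
            disordered = sum(row[col] == "D" for row in aligned_flags)
            conserved = disordered / n_seqs >= min_fraction
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_length:
                regions.append((start, col))
            start = None
    return regions
```

The defaults mirror the thresholds stated in the abstract; lowering `min_length` is convenient for small test alignments.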

15.
Wang J  Feng JA 《Protein engineering》2003,16(11):799-807
This paper reports an extensive sequence analysis of the alpha-helices of proteins. alpha-Helices were extracted from the Protein Data Bank (PDB) and were divided into groups according to their sizes. It was found that some amino acids had differential propensity values for adopting helical conformation in short, medium and long alpha-helices. Pro and Trp had a significantly higher propensity for helical conformation in short helices than in medium and long helices. Trp was the strongest helix conformer in short helices. Sequence patterns favoring helical conformation were derived from a neighbor-dependent sequence analysis of proteins, which calculated the effect of neighboring amino acid type on the propensity of residues for adopting a particular secondary structure in proteins. This method produced an enhanced statistical significance scale that allowed us to explore the positional preference of amino acids for alpha-helical conformations. It was shown that the amino acid pair preference for alpha-helix had a unique pattern and this pattern was not always predictable by assuming proportional contributions from the individual propensity values of the amino acids. Our analysis also yielded a series of amino acid dyads that showed preference for alpha-helix conformation. The data presented in this study, along with our previous study on loop sequences of proteins, should prove useful for developing potential 'codes' for recognizing sequence patterns that are favorable for specific secondary structural elements in proteins.  相似文献   

16.
The GXXXG motif is a frequently occurring sequence of residues that is known to favor helix-helix interactions in membrane proteins. Here we show that the GXXXG motif is also prevalent in soluble proteins whose structures have been determined. Some 152 proteins from a non-redundant PDB set contain at least one alpha-helix with the GXXXG motif, 41 +/- 9% more than expected if glycine residues were uniformly distributed in those alpha-helices. More than 50% of the GXXXG-containing alpha-helices participate in helix-helix interactions. In fact, 26 of those helix-helix interactions are structurally similar to the helix-helix interaction of the glycophorin A dimer, where two transmembrane helices associate to form a dimer stabilized by the GXXXG motif. As for the glycophorin A structure, we find backbone-to-backbone atomic contacts of the C alpha-H...O type in each of these 26 helix-helix interactions that display the stereochemical hallmarks of hydrogen bond formation. These glycophorin A-like helix-helix interactions are enriched in the general set of helix-helix interactions containing the GXXXG motif, suggesting that the inferred C alpha-H...O hydrogen bonds stabilize the helix-helix interactions. In addition to the GXXXG motif, some 808 proteins from the non-redundant PDB set contain at least one alpha-helix with the AXXXA motif (30 +/- 3% greater than expected). Both the GXXXG and AXXXA motifs occur frequently in predicted alpha-helices from 24 fully sequenced genomes. Occurrence of the AXXXA motif is enhanced to a greater extent in thermophiles than in mesophiles, suggesting that helical interaction based on the AXXXA motif may be a common mechanism of thermostability in protein structures. We conclude that the GXXXG sequence motif stabilizes helix-helix interactions in proteins, and that the AXXXA sequence motif also stabilizes the folded state of proteins.  相似文献   
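Locating GXXXG or AXXXA motifs inside known helical segments, the first step behind the counts reported in this abstract, can be sketched with a regular expression. The function names and the representation of helices as half-open index ranges are assumptions of this illustration. The comparison against an expected frequency under a uniform distribution of glycine, which the abstract reports, is not reproduced here.

```python
import re


def find_motif(sequence, residue="G", spacing=3):
    """Return start indices of residue-X{spacing}-residue motifs.

    A lookahead is used so that overlapping motifs (e.g. GXXXGXXXG)
    are all reported.
    """
    pattern = re.compile(f"(?={residue}.{{{spacing}}}{residue})")
    return [match.start() for match in pattern.finditer(sequence)]


def helices_with_motif(sequence, helix_ranges, residue="G"):
    """Return the helix ranges (start, end) that contain the motif entirely.

    `helix_ranges` holds half-open index ranges into `sequence`; how the
    helices were assigned is outside the scope of this sketch.
    """
    hits = []
    for start, end in helix_ranges:
        if find_motif(sequence[start:end], residue):
            hits.append((start, end))
    return hits
```

For instance, `find_motif("AGAAAGAAAG")` reports both overlapping motifs, starting at indices 1 and 5, and passing `residue="A"` searches for AXXXA instead.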

17.
Globular proteins adopt complex folds, composed of organized assemblies of alpha-helix and beta-sheet together with irregular regions that interconnect these scaffold elements. Here, we seek to parse the irregular regions into their structural constituents and to rationalize their formative energetics. Toward this end, we dissected the Protein Coil Library, a structural database of protein segments that are neither alpha-helix nor beta-strand, extracted from high-resolution protein structures. The backbone dihedral angles of residues from coil library segments are distributed indiscriminately across the phi,psi map, but when contoured, seven distinct basins emerge clearly. The structures and energetics associated with the two least-studied basins are the primary focus of this article. Specifically, the structural motifs associated with these basins were characterized in detail and then assessed in simple simulations designed to capture their energetic determinants. It is found that conformational constraints imposed by excluded volume and hydrogen bonding are sufficient to reproduce the observed phi,psi distributions of these motifs; no additional energy terms are required. These three motifs in conjunction with alpha-helices, strands of beta-sheet, canonical beta-turns, and polyproline II conformers comprise approximately 90% of all protein structure.

18.
In this paper we present a novel approach to membrane protein secondary structure prediction based on the statistical stepwise discriminant analysis method. A new aspect of our approach is the possibility of deriving physicochemical properties that may affect the formation of membrane protein secondary structure. Certain physicochemical properties of protein chains can be used to clarify the formation of the secondary structure types under consideration. Another aspect of our approach is that the results of multiple sequence alignment, or of other kinds of sequence alignment, are not used within the method. Using our approach, we predicted the formation of three main secondary structure types (alpha-helix, beta-structure and coil) with high accuracy, namely Q(3) = 76%. For predicting alpha-helix versus non-alpha-helix states, we reached an accuracy of Q(2) = 86%. We have also identified certain protein chain properties that affect the formation of membrane protein secondary structure. These properties include the hydrophobicity of amino acid residues, the presence of Gly, Ala and Val residues, and the location of the protein chain end.

19.
Van Dorn LO  Newlove T  Chang S  Ingram WM  Cordes MH 《Biochemistry》2006,45(35):10542-10553
In the Cro protein family, an evolutionary change in secondary structure has converted an alpha-helical fold to a mixture of alpha-helix and beta-sheet. P22 Cro and lambda Cro represent the ancestral all-alpha and descendant alpha+beta folds, respectively. The major structural differences between these proteins are at the C-terminal end of the domain (residues 34-56), where two alpha-helices in P22 Cro align with two beta-strands in lambda Cro. We sought to assess the possibility that smooth evolutionary transitions could have converted the all-alpha structure to the alpha+beta structure through sequences that could adopt both folds. First, we used scanning mutagenesis to identify and compare patterns of key stabilizing residues in the C-terminal regions of both P22 Cro and lambda Cro. These patterns exhibited little similarity to each other, with structurally important residues in the two proteins most often occurring at different sequence positions. Second, "hybrid scanning" studies, involving replacement of each wild-type residue in P22 Cro with the aligned wild-type residue in lambda Cro and vice versa, revealed five or six residues in each protein that strongly destabilized the other. These results suggest that key stability determinants for each Cro fold are quite different and that the P22 Cro sequence strongly favors the all-alpha structure while the lambda Cro sequence strongly favors the alpha+beta structure. Nonetheless, we were able to design a "structurally ambivalent" sequence fragment (SASF1), which corresponded to residues 39-56 and simultaneously incorporated most key stabilizing residues for both P22 Cro and lambda Cro. NMR experiments showed SASF1 to stably fold as a beta-hairpin when incorporated into the lambda Cro sequence but as a pair of alpha-helices when incorporated into P22 Cro.  相似文献   

20.
Prediction of the Secondary Structure of Myelin Basic Protein
An investigation into the probable secondary structure of the myelin basic protein was carried out by the application of three procedures currently in use to predict the secondary structures of proteins from knowledge of their amino acid sequences. In order to increase the accuracy of the predictions, the amino acid substitutions that occur in the basic protein from different species were incorporated into the predictive algorithms. It was possible to locate regions of probable alpha-helix, beta-structure, beta-turn, and unordered conformation (coil) in the protein. One of the predictive methods introduces a bias into the algorithm to maximize or minimize the amounts of alpha-helix and/or beta-structure present; this made it possible to assess how conditions such as pH and protein concentration or the presence of anionic amphiphilic molecules could influence the protein's secondary structure. The predictions made by the three methods were in reasonably good agreement with one another. They were consistent with experimental data, provided that the stabilizing or destabilizing effects of the environment were taken into account. According to the predictions, the extent of possible alpha-helix and beta-structure formation in the protein is severely restricted by the low frequency and extensive scattering of hydrophobic residues, along with a high frequency and extensive scattering of residues that favor the formation of beta-turns and coils. Neither prolyl residues nor cationic residues per se are responsible for the low content of alpha-helix predicted in the protein. The principal ordered conformation predicted is the beta-turn. Many of the predicted beta-turns overlap extensively, involving in some cases up to 10 residues. In some of these structures it is possible for the peptide backbone to oscillate in a sinusoidal manner, generating a flat, pleated sheetlike structure. Cationic residues located in these structures would appear to be ideally oriented for interaction with lipid phosphate groups located at the cytoplasmic surface of the myelin membrane. An analysis of possible and probable conformations that the triproline sequence could assume questions the popular notion that this sequence produces a hairpin turn in the basic protein.
