Similar articles
Found 20 similar articles (search time: 31 ms)
1.
We present a comprehensive evaluation of a new structure mining method called PB-ALIGN. It is based on encoding protein structure as a 1D sequence built from 16 short structural motifs, or protein blocks (PBs). PBs are short motifs capable of representing most of the local structural features of a protein backbone. Using a derived PB substitution matrix and a simple dynamic programming algorithm, PB sequences are aligned in the same way as amino acid sequences to yield a structure alignment. Aligning these local features as sequences of symbols enables fast detection of structural similarities between two proteins. The method's ability to characterize and align regions beyond regular secondary structures, for example the N and C caps of helices and the loops connecting regular structures, puts it a step ahead of existing methods, which rely strongly on secondary structure elements. PB-ALIGN achieved an efficiency of 85% in extracting the true fold from a large database of 7259 SCOP domains and identified true superfamily members in 82% of cases. In comparison with 13 existing structure comparison/mining methods, PB-ALIGN emerged as the best on the general ability test dataset and was on par with methods such as YAKUSA and CE on the nontrivial test dataset. Furthermore, the proposed method performed well when compared with a flexible structure alignment method such as FATCAT and outperformed it in processing speed (less than 45 s per database scan). This work also establishes a reliable cut-off value for the demarcation of similar folds. Finally, it shows that global alignment scores of unrelated structures using PBs follow an extreme value distribution. PB-ALIGN is freely available on a web server called Protein Block Expert (PBE) at http://bioinformatics.univ-reunion.fr/PBE/.
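The alignment step described in this abstract (PB sequences aligned by dynamic programming with a substitution matrix) can be illustrated with a minimal sketch. This is not the PB-ALIGN implementation: the scoring values, the linear gap penalty, and the names `toy_substitution_matrix` and `global_align_score` are assumptions for illustration, and the toy matrix is a stand-in for the published PB substitution matrix, which is not reproduced here.

```python
from typing import Dict, Tuple

PB_ALPHABET = "abcdefghijklmnop"  # the 16 protein blocks, labelled a to p


def toy_substitution_matrix(match: float = 2.0,
                            mismatch: float = -1.0) -> Dict[Tuple[str, str], float]:
    """Placeholder scores (assumption): NOT the published PB substitution matrix."""
    return {(x, y): (match if x == y else mismatch)
            for x in PB_ALPHABET for y in PB_ALPHABET}


def global_align_score(seq_a: str, seq_b: str,
                       matrix: Dict[Tuple[str, str], float],
                       gap: float = -2.0) -> float:
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0.0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + matrix[(seq_a[i - 1], seq_b[j - 1])]
            up = score[i - 1][j] + gap      # gap in seq_b
            left = score[i][j - 1] + gap    # gap in seq_a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]


if __name__ == "__main__":
    toy = toy_substitution_matrix()
    print(global_align_score("mmmmnopac", "mmmnopac", toy))
```

Swapping the toy matrix for a real 16 × 16 PB matrix would leave the recurrence unchanged; only the `matrix` argument differs.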

2.
The extrinsic proteins of photosystem II of higher plants and green algae, PsbO, PsbP, PsbQ, and PsbR, are essential for stable oxygen production in the oxygen evolving center. In the available X‐ray crystallographic structure of higher plant PsbQ, residues S14‐Y33 are missing. Building on the backbone NMR assignment of PsbQ, which includes this “missing link”, we report the extended resonance assignment including side chain atoms. Based on nuclear Overhauser effect spectra, a high resolution solution structure of PsbQ with a backbone RMSD of 0.81 Å was obtained from torsion angle dynamics. Within the N‐terminal residues 1–45, the solution structure deviates significantly from the X‐ray crystallographic one, while the four‐helix bundle core found previously is confirmed. A short α‐helix is observed in the solution structure at the location where a β‐strand had been proposed in the earlier crystallographic study. NMR relaxation data and unrestrained molecular dynamics simulations corroborate that the N‐terminal region behaves as a flexible tail with a persistent short local helical secondary structure, while no indications of forming a β‐strand are found. Proteins 2015; 83:1677–1686. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.

3.
Similarity of protein structures has been analyzed using three-dimensional Delaunay triangulation patterns derived from the backbone representation. It has been found that structurally related proteins have a common spatial invariant part, a set of tetrahedrons, mathematically described as a common spatial subgraph volume of the three-dimensional contact graph derived from Delaunay tessellation (DT). Based on this property of protein structures, we present a novel common volume superimposition (TOPOFIT) method to produce structural alignments. Structural alignments are usually evaluated by the number of equivalent (aligned) positions (N(e)) with the corresponding root mean square deviation (RMSD). The superimposition of the DT patterns allows one to uniquely identify a maximal common number of equivalent residues in the structural alignment. In other words, TOPOFIT identifies a feature point on the curve of RMSD versus N(e), the topomax point, up to which the topologies of two structures correspond to each other, including backbone and interresidue contacts, whereas a growing number of mismatches between the DT patterns occurs at larger RMSD (N(e)) after the topomax point. It has been found that the topomax point is present in all alignments from different protein structural classes; therefore, the TOPOFIT method identifies common, invariant structural parts between proteins. The alignments produced by the TOPOFIT method correlate well with alignments produced by other current methods. This novel method opens new opportunities for the comparative analysis of protein structures and for more detailed studies on understanding the molecular principles of tertiary structure organization and functionality. The TOPOFIT method also helps to detect conformational changes and topological differences in variable parts, which are particularly important for studies of variations in active/binding sites and for protein classification.

4.
In this study we classified regions of random coil into four types: coil between alpha helix and beta strand, coil between beta strand and alpha helix, coil between two alpha helices, and coil between two beta strands. This classification may be considered natural. We used 610 3D structures of proteins collected from the Protein Data Bank from bacteria with low, average and high genomic GC-content. Relatively short regions of coil are not random: certain amino acid residues are more or less frequent in each of the types of coil. Namely, hydrophobic amino acids with branched side chains (Ile, Val and Leu) are rare in coil between two beta strands, unlike some acrophilic amino acids (Asp, Asn and Gly). In contrast, coil between two alpha helices is enriched in Leu. Regions of coil between alpha helix and beta strand are enriched in positively charged amino acids (Arg and Lys), while the usage of residues with side chains possessing a hydroxyl group (Ser and Thr) is low in them, in contrast to the regions of coil between beta strand and alpha helix. Regions of coil between beta strand and alpha helix are significantly enriched in Cys residues. The response to symmetric mutational pressure (AT-pressure or GC-pressure) is also quite different for the four types of coil. The most conserved regions of coil are the “connecting bridges” between beta strand and alpha helix, since their amino acid content depends less strongly on the GC-content of genes than the amino acid contents of the other three types of coil. Possible causes and consequences of the described differences in amino acid content distribution between different types of random coil are discussed.
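The four-way coil classification described in this abstract can be sketched on a per-residue secondary structure string. This is an illustrative sketch only: the three-letter alphabet (`H` helix, `E` strand, `C` coil), the function name `classify_coils`, and the decision to skip terminal coil are assumptions, not details taken from the study.

```python
import re
from typing import List, Tuple

# Coil type keyed by (element before the coil, element after the coil).
COIL_TYPES = {
    ("H", "E"): "helix-to-strand",
    ("E", "H"): "strand-to-helix",
    ("H", "H"): "helix-to-helix",
    ("E", "E"): "strand-to-strand",
}


def classify_coils(ss: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, type) for each coil run flanked by H or E on both sides.

    `end` is exclusive, so ss[start:end] is the coil run itself.
    """
    regions = []
    for match in re.finditer(r"C+", ss):
        start, end = match.start(), match.end()
        if start == 0 or end == len(ss):
            continue  # terminal coil has only one flanking element; skipped here
        coil_type = COIL_TYPES.get((ss[start - 1], ss[end]))
        if coil_type is not None:
            regions.append((start, end, coil_type))
    return regions


if __name__ == "__main__":
    for region in classify_coils("CCHHHHCCEEEECCCHHHHCC"):
        print(region)
```

Counting amino acids inside each returned region, grouped by type, would give the per-type composition tables that the abstract compares.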

5.
Although most proteins conform to the classical one‐structure/one‐function paradigm, an increasing number of proteins with dual structures and functions have been discovered. In response to cellular stimuli, such proteins undergo structural changes sufficiently dramatic to remodel even their secondary structures and domain organization. This “fold‐switching” capability fosters protein multi‐functionality, enabling cells to establish tight control over various biochemical processes. Accurate predictions of fold‐switching proteins could both suggest underlying mechanisms for uncharacterized biological processes and reveal potential drug targets. Recently, we developed a prediction method for fold‐switching proteins using structure‐based thermodynamic calculations and discrepancies between predicted and experimentally determined protein secondary structure (Porter and Looger, Proc Natl Acad Sci U S A 2018; 115:5968–5973). Here we seek to leverage the negative information found in these secondary structure prediction discrepancies. To do this, we quantified secondary structure prediction accuracies of 192 known fold‐switching regions (FSRs) within solved protein structures found in the Protein Data Bank (PDB). We find that the secondary structure prediction accuracies for these FSRs vary widely. Inaccurate secondary structure predictions are strongly associated with fold‐switching proteins compared to equally long segments of non‐fold‐switching proteins selected at random. These inaccurate predictions are enriched in helix‐to‐strand and strand‐to‐coil discrepancies. Finally, we find that most proteins with inaccurate secondary structure predictions are underrepresented in the PDB compared with their alternatively folded cognates, suggesting that unequal representation of fold‐switching conformers within the PDB could be an important cause of inaccurate secondary structure predictions. These results demonstrate that inconsistent secondary structure predictions can serve as a useful preliminary marker of fold switching.
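Quantifying prediction accuracy over a region, and tallying which kinds of discrepancy occur, can be sketched as follows. The abstract does not state the exact accuracy measure, so per-residue three-state agreement is an assumption here, as are the function names `region_accuracy` and `discrepancy_counts` and the `H`/`E`/`C` alphabet.

```python
from collections import Counter
from typing import Tuple


def _check(predicted: str, observed: str, start: int, end: int) -> None:
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not 0 <= start < end <= len(observed):
        raise ValueError("region must satisfy 0 <= start < end <= len(sequence)")


def region_accuracy(predicted: str, observed: str, start: int, end: int) -> float:
    """Fraction of residues in [start, end) whose predicted state equals the observed one."""
    _check(predicted, observed, start, end)
    pairs = zip(predicted[start:end], observed[start:end])
    return sum(p == o for p, o in pairs) / (end - start)


def discrepancy_counts(predicted: str, observed: str,
                       start: int, end: int) -> "Counter[Tuple[str, str]]":
    """Count mismatches in [start, end), keyed as (observed state, predicted state)."""
    _check(predicted, observed, start, end)
    pairs = zip(predicted[start:end], observed[start:end])
    return Counter((o, p) for p, o in pairs if p != o)


if __name__ == "__main__":
    pred = "HHHHEEEECC"
    obs = "HHHHHHEECC"
    print(region_accuracy(pred, obs, 0, len(obs)))
    print(discrepancy_counts(pred, obs, 0, len(obs)))
```

A key such as `("H", "E")` records residues observed as helix but predicted as strand, which corresponds to the helix‐to‐strand discrepancies the abstract mentions.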

6.
The three-dimensional solution structure of apo rabbit lung calcyclin has been refined to high resolution through the use of heteronuclear NMR spectroscopy and 13C,15N-enriched protein. Upon completing the assignment of virtually all of the 15N, 13C and 1H NMR resonances, the solution structure was determined from a combination of 2814 NOE-derived distance constraints, and 272 torsion angle constraints derived from scalar couplings. A large number of critical inter-subunit NOEs (386) were identified from 13C-select, 13C-filtered NOESY experiments, providing a highly accurate dimer interface. The combination of distance geometry and restrained molecular dynamics calculations yielded structures with excellent agreement with the experimental data and high precision (rmsd from the mean for the backbone atoms in the eight helices: 0.33 Å). Calcyclin exhibits a symmetric dimeric fold of two identical 90 amino acid subunits, characteristic of the S100 subfamily of EF-hand Ca2+-binding proteins. The structure reveals a readily identified pair of putative sites for binding of Zn2+. In order to accurately determine the structural features that differentiate the various S100 proteins, distance difference matrices and contact maps were calculated for the NMR structural ensembles of apo calcyclin and rat and bovine S100B. These data show that the most significant variations among the structures are in the positioning of helix III and in loops, the regions with least sequence similarity. Inter-helical angles and distance differences for the proteins show that the positioning of helix III of calcyclin is most similar to that of bovine S100B, but that the helix interfaces are more closely packed in calcyclin than in either S100B structure. Surprisingly large differences were found in the positioning of helix III in the two S100B structures, despite there being only four non-identical residues, suggesting that one or both of the S100B structures requires further refinement.

7.
The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence‐search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino‐acid sequence of a protein to fit to known protein 3D structures using a structural alphabet known as “Protein Blocks” (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. They are used to encode 3D protein structures into 1D PB sequences and to capture sequence to structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence‐search methods, nor does it explicitly incorporate the hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z‐score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales up well and offers promising perspectives for structural annotations at the genomic level. It has been implemented in the form of a web‐server that is freely available at http://www.bo-protscience.fr/forsa.

8.
The amino acid sequences of soluble, ordered proteins with stable structures have evolved due to biological and physical requirements, thus distinguishing them from random sequences. Previous analyses have focused on extracting the features that frequently appear in protein substructures, such as α‐helix and β‐sheet, but the universal features of protein sequences have not been addressed. To clarify the differences between native protein sequences and random sequences, we analyzed 7368 soluble, ordered protein sequences by comparing the observed occurrences of 400 amino acid pairs in local proximity, up to 10 residues apart along the sequence, with their expected occurrences in random sequences. We found that hydrophobic residue pairs and polar residue pairs are significantly decreased, whereas pairs between a hydrophobic residue and a polar residue are increased. This trend was universally observed regardless of the secondary structure content but was not observed in protein sequences that include intrinsically disordered regions, indicating that it can be a general rule of protein foldability. The possible benefits of this rule are discussed from the viewpoints of protein aggregation and disorder, which are both caused by low‐complexity regions of hydrophobic or polar residues.
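The observed-versus-expected comparison of residue pairs can be sketched as below. The abstract does not give the exact null model, so this sketch assumes that the expected count of an ordered pair is the total number of pairs multiplied by the two residues' overall frequencies; the function name `pair_ratios` and the `max_sep` parameter are likewise illustrative.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def pair_ratios(sequences: Iterable[str],
                max_sep: int = 10) -> Dict[Tuple[str, str], float]:
    """Observed/expected ratio for ordered residue pairs up to max_sep apart.

    Expected counts assume residues are drawn independently from the overall
    composition (an assumed null model, not necessarily the study's).
    """
    pair_counts: Counter = Counter()
    residue_counts: Counter = Counter()
    for seq in sequences:
        residue_counts.update(seq)
        for i, first in enumerate(seq):
            # Residues at sequence separations 1..max_sep after position i.
            for second in seq[i + 1:i + 1 + max_sep]:
                pair_counts[(first, second)] += 1
    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())
    ratios = {}
    for (first, second), observed in pair_counts.items():
        expected = (total_pairs
                    * residue_counts[first] / total_residues
                    * residue_counts[second] / total_residues)
        ratios[(first, second)] = observed / expected
    return ratios


if __name__ == "__main__":
    for pair, ratio in sorted(pair_ratios(["MKVLAAGIVL", "MSDEEKKRLL"], max_sep=3).items()):
        print(pair, round(ratio, 2))
```

A ratio below 1 marks a pair seen less often than composition alone would predict; with the 20 standard amino acids there are at most 400 ordered pairs, matching the count in the abstract.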

9.
Mooney SD, Liang MH, DeConde R, Altman RB. Proteins 2005, 61(4):741-747
A primary challenge for structural genomics is the automated functional characterization of protein structures. We have developed a sequence-independent method called S-BLEST (Structure-Based Local Environment Search Tool) for the annotation of previously uncharacterized protein structures. S-BLEST encodes the local environment of an amino acid as a vector of structural property values. It has been applied to all amino acids in a nonredundant database of protein structures to generate a searchable structural resource. Given a query amino acid from an experimentally determined or modeled structure, S-BLEST quickly identifies similar amino acid environments using a K-nearest neighbor search. In addition, the method gives an estimate of the statistical significance of each result. We validated S-BLEST on X-ray crystal structures from the ASTRAL 40 nonredundant dataset. We then applied it to 86 crystallographically determined proteins in the Protein Data Bank (PDB) with unknown function and with no significant sequence neighbors in the PDB. S-BLEST was able to associate 20 proteins with at least one local structural neighbor and identify the amino acid environments that are most similar between those neighbors.
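A K-nearest neighbor search over environment vectors, as named in this abstract, can be sketched in a few lines. This is not S-BLEST itself: the Euclidean metric, the absence of feature scaling, the brute-force scan, and the name `k_nearest` are all assumptions, since the abstract specifies neither the distance measure nor the index structure.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def k_nearest(query: Sequence[float],
              database: Dict[str, Sequence[float]],
              k: int = 3) -> List[Tuple[float, str]]:
    """Return the k closest (distance, residue_id) entries, nearest first.

    Brute-force scan with Euclidean distance (both are assumptions).
    """
    scored = ((math.dist(query, vector), residue_id)
              for residue_id, vector in database.items())
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    # Hypothetical residue identifiers and two-component property vectors.
    environments = {
        "1abc:A:H57": (0.0, 0.0),
        "2xyz:B:S195": (1.0, 0.0),
        "3def:A:D102": (5.0, 5.0),
    }
    for distance, residue_id in k_nearest((0.9, 0.0), environments, k=2):
        print(residue_id, round(distance, 3))
```

For a database covering every residue in a nonredundant structure set, a spatial index would normally replace the linear scan; the scan is kept here only for brevity.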

10.
Circular dichroism (CD) spectroscopy is a valuable method for defining canonical secondary structure contents of proteins based on empirically‐defined spectroscopic signatures derived from proteins with known three‐dimensional structures. Many proteins identified as being “Intrinsically Disordered Proteins” have a significant amount of their structure that is neither sheet, helix, nor turn; this type of structure is often classified by CD as “other”, “random coil”, “unordered”, or “disordered”. However, the “other” category can also include polyproline II (PPII)‐type structures, whose spectral properties have not been well distinguished from those of unordered structures. In this study, synchrotron radiation circular dichroism spectroscopy was used to investigate the spectral properties of collagen and polyproline, which both contain PPII‐type structures. Their native spectra were compared as representatives of PPII structures. In addition, their spectra before and after treatment under various conditions to produce unfolded or denatured structures were also compared, with the aim of defining the differences between CD spectra of PPII and disordered structures. We conclude that the spectral features of collagen are more appropriate than those of polyproline for use as the representative spectrum for PPII structures present in typical amino acid‐containing proteins, and that the single most characteristic spectroscopic feature distinguishing a PPII structure from a disordered structure is the presence of a positive peak around 220 nm in the former but not in the latter. These spectra are now available for inclusion in new reference data sets used for CD analyses of the secondary structures of soluble proteins.

11.
Analysis of protein structures based on backbone structural patterns known as structural alphabets has been shown to be very useful. Among them, a set of 16 pentapeptide structural motifs known as protein blocks (PBs) has been identified, upon which backbone models of most protein structures can be built. PBs allow simplification of 3D space onto 1D space in the form of a sequence of PBs. Here, for the first time, substitution probabilities of PBs in a large number of aligned homologous protein structures have been studied and are expressed as a simplified 16 × 16 substitution matrix. The matrix was validated by benchmarking how well it can align sequences of PBs, much like amino acid alignment, to identify structurally equivalent regions in closely or distantly related proteins using a dynamic programming approach. The alignment results obtained are comparable to those of well established structure comparison methods such as DALI and STAMP. Other interesting applications of the matrix have been investigated. We first show that, in variable regions between two superimposed homologous proteins, one can distinguish between local conformational differences and rigid-body displacement of a conserved motif by comparing the PBs and their substitution scores. Second, we demonstrate, with the example of aspartic proteinases, that PBs can be efficiently used to detect lobe/domain flexibility in multidomain proteins. Lastly, using protein kinase as an example, we identify regions of conformational variations and rigid body movements in the enzyme as it changes from an inactive state to the active state.
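Deriving a 16 × 16 substitution matrix from aligned PB positions can be sketched with a standard log-odds calculation. The abstract does not describe how the published matrix was computed or simplified, so the log2 odds formula, the symmetric counting, the pseudocount, and the name `pb_log_odds` are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple

PB_ALPHABET = "abcdefghijklmnop"  # the 16 protein blocks


def pb_log_odds(aligned_pairs: Iterable[Tuple[str, str]],
                pseudocount: float = 1.0) -> Dict[Tuple[str, str], float]:
    """Log2 odds of seeing PB x aligned with PB y versus chance pairing.

    aligned_pairs: (x, y) PBs at structurally equivalent positions.
    A pseudocount keeps unseen substitutions finite (assumed smoothing).
    """
    counts: Counter = Counter()
    for x, y in aligned_pairs:
        counts[(x, y)] += 1
        counts[(y, x)] += 1  # count both directions so the matrix is symmetric
    cells = len(PB_ALPHABET) ** 2
    total = sum(counts.values()) + pseudocount * cells
    background = {
        x: sum(counts[(x, y)] + pseudocount for y in PB_ALPHABET) / total
        for x in PB_ALPHABET
    }
    return {
        (x, y): math.log2(((counts[(x, y)] + pseudocount) / total)
                          / (background[x] * background[y]))
        for x in PB_ALPHABET for y in PB_ALPHABET
    }


if __name__ == "__main__":
    # Made-up counts purely to exercise the function.
    pairs = [("m", "m")] * 50 + [("a", "a")] * 50 + [("m", "a")] * 2
    matrix = pb_log_odds(pairs)
    print(round(matrix[("m", "m")], 2), round(matrix[("m", "a")], 2))
```

Positive entries mark substitutions seen more often than chance pairing would give, negative entries less often; a dictionary in this form can be passed directly as the score table of a dynamic programming aligner.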

12.
Coiled‐coils are essential components of many protein complexes. First discovered in structural proteins such as keratins, they have since been found to figure largely in the assembly and dynamics required for diverse functions, including membrane fusion, signal transduction and motors. Coiled‐coils have a characteristic repeating seven‐residue geometric and sequence motif, which is sometimes interrupted by the insertion of one or more residues. Such insertions are often highly conserved and critical to interdomain communication in signaling proteins such as bacterial histidine kinases. Here we develop the “accommodation index” as a parameter that allows automatic detection and classification of insertions based on the three dimensional structure of a protein. This method allows precise identification of the type of insertion and the “accommodation length” over which the insertion is structurally accommodated. A simple theory is presented that predicts the structural perturbations of 1‐, 3‐, and 4‐residue insertions as a function of the length over which the insertion is accommodated. Analysis of experimental structures is in good agreement with theory, and shows that short accommodation lengths give rise to greater perturbation of helix packing angles, changes in local helical phase, and increased structural asymmetry relative to long accommodation lengths. Cytoplasmic domains of histidine kinases in different signaling states display large changes in their accommodation lengths, which can now be seen to underlie diverse structural transitions including symmetry/asymmetry and local variations in helical phase that accompany signal transduction.

13.
The Alacoil is an antiparallel (rather than the usual parallel) coiled-coil of α-helices with Ala or another small residue in every seventh position, allowing a very close spacing of the helices (7.5–8.5 Å between local helix axes), often over four or five helical turns. It occurs in two distinct types that differ by which position of the heptad repeat is occupied by Ala and by whether the closest points on the backbone of the two helices are aligned or are offset by half a turn. The aligned, or ROP, type has Ala in position “d” of the heptad repeat, which occupies the “tip-to-tip” side of the helix contact where the Cα–Cβ bonds point toward each other. The more common offset, or ferritin, type of Alacoil has Ala in position “a” of the heptad repeat (where the Cα-Cβ bonds lie back-to-back, on the “knuckle-touch” side of the helix contact), and the backbones of the two helices are offset vertically by half a turn. In both forms, successive layers of contact have the Ala first on one and then on the other helix. The Alacoil structure has much in common with the coiled-coils of fibrous proteins or leucine zippers: both are α-helical coiled-coils, with a critical amino acid repeated every seven residues (the Leu or the Ala) and a secondary contact position in between. However, Leu zippers are between aligned, parallel helices (often identical, in dimers), whereas Alacoils are between antiparallel helices, usually offset, and much closer together. The Alacoil, then, could be considered as an “Ala anti-zipper.” Leu zippers have a classic “knobs-into-holes” packing of the Leu side chain into a diamond of four residues on the opposite helix; for Alacoils, the helices are so close together that the Ala methyl group must choose one side of the diamond and pack inside a triangle of residues on the other helix. 
We have used the ferritin-type Alacoil as the basis for the de novo design of a 66-residue, coiled helix hairpin called “Alacoilin.” Its sequence is: cmSP DQWDKE A AQYDAHA QE FEKKS HRNng TPEA DQYRHM A SQY QAMA QK LKAIA NQLKK Gseter (with “a” heptad positions underlined and nonhelical parts in lowercase), which we will produce and test for both stability and uniqueness of structure.

14.
Analysis of the conformational distribution of polypeptide segments in a conformational space is the first step toward understanding the principles of structural diversity of proteins. Here, we present a statistical analysis of protein local structures based on interatomic Cα distances. Using principal component analysis (PCA) on the intrasegment Cα-Cα atomic distances, the conformational space of protein segments, which we call the protein segment universe, has been visualized, and three essential coordinate axes suitable for describing the universe have been identified. The three essential axes specify the radius of gyration, structural symmetry, and the separation of hairpin structures from other structures. These axes were found to be conserved across segments of arbitrary length, from 6 to 22 residues. Further application of PCA to the two largest clusters in the universe revealed local structural motifs. Although some of the motifs have already been reported, we identified a possibly novel strand motif. We also showed that a capping box, which is one of the helix capping motifs, was separated into independent subclusters based on the Cα geometry. Implications of the strand motif, which may play a role in protein-protein interaction, are discussed. The proposed method is useful not only for mapping the immense universe of protein structures but also for identifying structural motifs.

15.
Haipeng Gong. Proteins 2017, 85(12):2162-2169
Helix‐helix interactions are crucial in the structure assembly, stability and function of helix‐rich proteins including many membrane proteins. In spite of remarkable progress over the past decades, the accuracy of predicting protein structures from their amino acid sequences is still far from satisfactory. In this work, we focused on a simpler problem, the prediction of helix‐helix interactions, the results of which could facilitate practical protein structure prediction by constraining the sampling space. Specifically, we started from the noisy 2D residue contact maps derived from correlated residue mutations, and utilized ridge detection to identify the characteristic residue contact patterns for helix‐helix interactions. The ridge information as well as a few additional features were then fed into a machine learning model, HHConPred, to predict interactions between helix pairs. In an independent test, our method achieved an F‐measure of ~60% for predicting helix‐helix interactions. Moreover, although the model was trained mainly using soluble proteins, it could be extended to membrane proteins with at least comparable performance relative to previous approaches that were developed purely using membrane proteins. All data and source code are available at http://166.111.152.91/Downloads.html or https://github.com/dpxiong/HHConPred.
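The F‐measure quoted in this abstract combines precision and recall over predicted helix pairs. A minimal sketch follows, assuming the balanced F1 form and representing each interaction as an unordered pair of helix indices; the function name `f_measure` and the example pairs are illustrative and not taken from HHConPred.

```python
from typing import FrozenSet, Set


def f_measure(predicted: Set[FrozenSet[int]], actual: Set[FrozenSet[int]]) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0  # also covers empty prediction or empty reference sets
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # frozenset makes (1, 2) and (2, 1) the same interacting helix pair.
    predicted = {frozenset(p) for p in [(1, 2), (1, 3), (2, 4)]}
    actual = {frozenset(p) for p in [(2, 1), (2, 4), (3, 4)]}
    print(round(f_measure(predicted, actual), 3))
```

Here two of three predictions are correct and two of three true interactions are recovered, so precision, recall, and their harmonic mean all equal 2/3.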

16.
Proteins 2018, 86(3):273-278
Unusual local arrangements of protein in Ramachandran space are not well represented by standard geometry tools used in either protein structure refinement using simple harmonic geometry restraints or in protein simulations using molecular mechanics force fields. In contrast, quantum chemical computations using small poly‐peptide molecular models can predict accurate geometries for any well‐defined backbone Ramachandran orientation. For conformations along transition regions (ϕ from −60 to 60°), a very good agreement with representative high‐resolution experimental X‐ray (≤1.5 Å) protein structures is obtained for both the backbone C−1‐N‐Cα angle and the nonbonded O−1…C distance, while “standard geometry” leads to the “clashing” of O…C atoms and Amber FF99SB predicts distances too large by about 0.15 Å. These results confirm that quantum chemistry computations add valuable support for detailed analysis of local structural arrangements in proteins, providing improved or missing data for less understood high‐energy or unusual regions.

17.
Point mutations in proteins can have different effects on protein stability depending on the mechanism of unfolding. In the most interesting case of I27, the Ig‐like module of the muscle protein titin, one point mutation (Y9P) yields opposite effects on protein stability during denaturant‐induced “global unfolding” versus “vectorial unfolding” by mechanical pulling force or cellular unfolding systems. Here, we assessed the reason for the different effects of the Y9P mutation of I27 on the overall molecular stability and N‐terminal unraveling by NMR. We found that the Y9P mutation causes a conformational change that is transmitted through β‐sheet structures to reach the central hydrophobic core in the interior and alters its accessibility to bulk solvent, which leads to destabilization of the hydrophobic core. On the other hand, the Y9P mutation causes a bend in the backbone structure, which leads to the formation of a more stable N‐terminal structure probably through enhanced hydrophobic interactions.

18.
The abundant existence of proteins and regions that possess specific functions without being folded into unique 3D structures has become accepted by a significant number of protein scientists. Sequences of these intrinsically disordered proteins (IDPs) and IDP regions (IDPRs) are characterized by a number of specific features, such as low overall hydrophobicity and high net charge, which make these proteins predictable. IDPs/IDPRs possess large hydrodynamic volumes and low contents of ordered secondary structure, and are characterized by high structural heterogeneity. They are very flexible, but some may undergo disorder to order transitions in the presence of natural ligands. The degree of these structural rearrangements varies over a very wide range. IDPs/IDPRs are tightly controlled under normal conditions and have numerous specific functions that complement the functions of ordered proteins and domains. When lacking proper control, they have multiple roles in the pathogenesis of various human diseases. Gaining structural and functional information about these proteins is a challenge, since they do not typically “freeze” while their “pictures are taken.” However, despite or perhaps because of the experimental challenges, these fuzzy objects with fuzzy structures and fuzzy functions are among the most interesting targets for modern protein research. This review briefly summarizes some of the recent advances in this exciting field and considers some of the basic lessons learned from the analysis of the physics, chemistry, and biology of IDPs.

19.
The armadillo domain is a right‐handed super‐helix of repeating units composed of three α‐helices each. Armadillo repeat proteins (ArmRPs) are frequently involved in protein–protein interactions, and because of their modular recognition of extended peptide regions they can serve as templates for the design of artificial peptide binding scaffolds. On the basis of sequential and structural analyses, different consensus‐designed ArmRPs were synthesized and showed high thermodynamic stabilities compared to naturally occurring ArmRPs. We determined the crystal structures of four full‐consensus ArmRPs with three or four identical internal repeats and two different designs for the N‐ and C‐caps. The crystal structures of these designs were refined at resolutions ranging from 1.80 to 2.50 Å. A redesign of our initial caps was required to obtain well diffracting crystals. However, the redesigned caps led to domain swapping events between the N‐caps. To prevent this domain swap, 9 and 6 point mutations were introduced in the N‐ and C‐caps, respectively. Structural and biophysical analysis showed that this subsequent redesign of the N‐cap prevented domain swapping and improved the thermodynamic stability of the proteins. We systematically investigated the best cap combinations. We conclude that designed ArmRPs with optimized caps are intrinsically stable and well‐expressed monomeric proteins and that the high‐resolution structures provide excellent structural templates for the continuation of the design of sequence‐specific modular peptide recognition units based on armadillo repeats.

20.
β-Strands as constituents of β-pleated sheets in protein tertiary structures often display considerable distortion from a purely extended conformation. The distortion types are often characterized as “bulging,” “twisting,” and “bending.” The first two have been extensively studied and classified. In this work, an investigation of bent β-structures is undertaken. The structural characteristics examined include the bending angles within and out of the principal strand plane, their distribution among various strand types such as parallel and antiparallel, the amino acid preferences at bend sites, and the usage of charged and polar residues for stabilization through interactive anchoring with other atoms of the β-sheet within which the bent strand lies.
