Similar articles
Found 20 similar articles (search time: 15 ms)
1.
2.
3.
Interresidue contacts within protein structures and at protein-protein interfaces are classically described by the amino acid types of the interacting residues, and the local structural context of the contact, if any, is described using secondary structures. In this study, we present an alternative analysis of interresidue contacts using local structures defined by the structural alphabet introduced by Camproux et al. This structural alphabet allows a 3D structure to be described as a sequence of prototype fragments, called structural letters, of 27 different types. Each residue can then be assigned to a particular local structure, even in loop regions. The analysis of interresidue contacts within protein structures defined using Voronoi tessellations reveals that pairwise contact specificity is greater in terms of structural letters than amino acids. Using a simple heuristic based on specificity score comparison, we find that 74% of the long-range contacts within protein structures are better described using structural letters than amino acid types. The investigation is extended to a set of protein-protein complexes, showing that similar global rules apply as for intraprotein contacts, with 64% of the interprotein contacts best described by local structures. We then present an evaluation of pairing functions integrating structural letters into decoy scoring and show that some complexes could benefit from the use of structural letter-based pairing functions.

4.
5.
The protein sequence world is considerably larger than the structure world. Consequently, numerous unrelated sequences may adopt similar 3D folds, and different kinds of amino acids may thus be found in similar 3D structures. By grouping the 20 amino acids into a smaller number of representative residues with similar features, the sequence world can be simplified. This clustering defines a reduced amino acid alphabet (reduced AAA). Numerous works have shown that protein 3D structures are composed of a limited number of building blocks, defining a structural alphabet. We previously identified such an alphabet, composed of 16 representative structural motifs (five residues long) called Protein Blocks (PBs). This alphabet makes it possible to translate the 3D structure into a 1D sequence of PBs. Based on these two concepts, reduced AAA and PBs, we analyzed the distributions of the different kinds of amino acids and their equivalences in the structural context. Different reduced sets were considered. Recurrent amino acid associations were found in all the local structures, while others were specific to some local structures (PBs) (e.g., Cysteine, Histidine, Threonine and Serine for the alpha-helix Ncap). Some similar associations are found in other reduced AAAs, e.g., Ile with Val, or the hydrophobic aromatic residues Trp with Phe and Tyr. We highlight interesting alternative associations, which shows the dependence on the information considered (sequence or structure). This approach, equivalent to a substitution matrix, could be useful for designing protein sequences with different features (for instance, adaptation to the environment) while mainly preserving the 3D fold.
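The reduced-alphabet idea in this abstract can be illustrated with a minimal Python sketch. The grouping below is hypothetical and only covers the two associations the abstract names (Ile with Val; Trp with Phe and Tyr). It is not the reduced alphabet derived in the paper, and the function names are invented for illustration.

```python
# Hypothetical grouping for illustration only; NOT the paper's reduced alphabet.
# Keys are the representative letters, values are the residues they stand for.
REDUCED_GROUPS = {
    "I": "IV",   # Ile with Val (association mentioned in the abstract)
    "F": "FWY",  # aromatic Phe, Trp, Tyr (association mentioned in the abstract)
}


def build_mapping(groups):
    """Turn {representative: members} into a per-residue lookup table."""
    mapping = {}
    for representative, members in groups.items():
        for residue in members:
            mapping[residue] = representative
    return mapping


def reduce_sequence(sequence, mapping):
    """Rewrite a sequence in the reduced alphabet.

    Residues that belong to no group are left unchanged.
    """
    return "".join(mapping.get(residue, residue) for residue in sequence.upper())


MAPPING = build_mapping(REDUCED_GROUPS)
print(reduce_sequence("MVIWFYKL", MAPPING))
```

With a complete grouping, every one of the 20 residue types would map to a representative, so two sequences that differ only by within-group substitutions would become identical after reduction.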

6.
Olfactory marker protein (OMP) is a ubiquitous, cytoplasmic protein found in mature olfactory receptor neurons of all vertebrates. Electrophysiological and behavioral studies demonstrate that it is a modulator of the olfactory signal transduction pathway. Here, we show that the solution structure of OMP, as determined by NMR, is a single globular domain comprised of eight beta-strands forming two beta-sheets oriented orthogonally to one another, thus exhibiting a "beta-clam" or "beta-sandwich" fold: beta-sheet 1 is comprised of beta3-beta8-beta1-beta2 and beta-sheet 2 contains beta6-beta5-beta4-beta7. Insertions include two long alpha-helices located on opposite sides of the beta-clam and three flexible loops. The juxtaposition of beta-strands beta6-beta5-beta4-beta7-beta2-beta1-beta8-beta3 forms a continuously curved surface and encloses one side of the beta-clam. The "cleft" formed by the two beta-sheets is opposite to the closed end of the beta-clam. Using a peptide titration series, we have identified this cleft as the binding surface for a peptide derived from the Bex1 protein. The highly conserved Omega-loop structure adjacent to the Bex1 peptide-binding surface found in OMP may be the site of additional OMP-protein interactions related to its role in modulating olfactory signal transduction. Thus, the interaction between the OMP and Bex1 proteins could facilitate the interaction between OMP and other components of the olfactory signaling pathway.

7.
Finding structural similarities between proteins often helps reveal shared functionality, which otherwise might not be detected by native sequence information alone. Such similarity is usually detected and quantified by protein structure alignment. Determining the optimal alignment between two protein structures, however, remains a hard problem. An alternative approach is to approximate each three-dimensional protein structure using a sequence of motifs derived from a structural alphabet. Using this approach, structure comparison is performed by comparing the corresponding motif sequences or structural sequences. In this article, we measure the performance of such alphabets in the context of the protein structure classification problem. We consider both local and global structural sequences. Each letter of a local structural sequence corresponds to the fragment that best matches the corresponding local segment of the protein structure. The global structural sequence is designed to generate the best possible complete chain that matches the full protein structure. We use an alphabet of 20 letters, corresponding to a library of 20 motifs or protein fragments of four residues each. We show that the global structural sequences approximate the native structures of proteins well, with an average coordinate root mean square deviation of 0.69 Å over 2225 test proteins. The approximation is best for all-α proteins and relatively poorer for all-β proteins. We then test the performance of four different sequence representations of proteins (their native sequence, the sequence of their secondary-structure elements, and the local and global structural sequences based on our fragment library) with different classifiers in their ability to classify proteins that belong to five distinct folds of CATH. Unsurprisingly, the primary sequence alone performs poorly as a structure classifier. We show that adding either secondary-structure information or local information from the structural sequence considerably improves the classification accuracy. The two fragment-based sequences perform better than the secondary-structure sequence but not well enough at this stage to be a viable alternative to more computationally intensive methods based on protein structure alignment.
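The construction of a local structural sequence described in this abstract (one letter per local segment, chosen as the best-matching library fragment) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure. The abstract measures fit by coordinate RMS, which requires superposition; the sketch swaps in a distance RMSD over internal Cα-Cα distances, which needs no superposition. The two-letter library and all coordinates are hypothetical toy values, whereas the paper uses 20 four-residue fragments.

```python
import math
from itertools import combinations


def internal_distances(points):
    """All pairwise distances between the points of one fragment."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def drmsd(fragment_a, fragment_b):
    """Distance RMSD: compares internal distances, so no superposition is needed.

    This is a stand-in for the coordinate RMS used in the abstract.
    """
    dist_a = internal_distances(fragment_a)
    dist_b = internal_distances(fragment_b)
    squared = sum((x - y) ** 2 for x, y in zip(dist_a, dist_b))
    return math.sqrt(squared / len(dist_a))


def local_structural_sequence(ca_coords, library, size=4):
    """Assign to each window of `size` residues the best-matching library letter."""
    letters = []
    for start in range(len(ca_coords) - size + 1):
        window = ca_coords[start:start + size]
        best = min(library, key=lambda letter: drmsd(window, library[letter]))
        letters.append(best)
    return "".join(letters)


# Hypothetical toy library: "E" is a straight fragment, "H" a compact one.
TOY_LIBRARY = {
    "E": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)],
    "H": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)],
}

# A straight toy chain of five points along the y axis (two overlapping windows).
straight_chain = [(0.0, 3.8 * i, 0.0) for i in range(5)]
print(local_structural_sequence(straight_chain, TOY_LIBRARY))
```

Because each window is matched independently, this corresponds to the local structural sequence only. The global structural sequence in the abstract is optimized so that the concatenated fragments rebuild the whole chain, which this sketch does not attempt.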

8.
We developed a novel approach for predicting local protein structure from sequence. It relies on the Hybrid Protein Model (HPM), an unsupervised clustering method we previously developed. This model learns three-dimensional protein fragments encoded into a structural alphabet of 16 protein blocks (PBs). Here, we focused on 11-residue fragments encoded as a series of seven PBs and used HPM to cluster them according to their local similarities. We thus built a library of 120 overlapping prototypes (mean fragments from each cluster), with good three-dimensional local approximation, i.e., a mean accuracy of 1.61 Å Cα root-mean-square distance. Our prediction method is intended to optimize the exploitation of the sequence-structure relations deduced from this library of long protein fragments. This was achieved by setting up a system of 120 experts, each defined by logistic regression to optimize the discrimination from sequence of a given prototype relative to the others. For a target sequence window, the experts computed probabilities of sequence-structure compatibility for the prototypes and ranked them, proposing the top scorers as structural candidates. Predictions were defined as successful when a prototype <2.5 Å from the true local structure was found among those proposed. Our strategy yielded a prediction rate of 51.2% for an average of 4.2 candidates per sequence window. We also proposed a confidence index to estimate prediction quality. Our approach predicts from sequence alone and will thus provide valuable information for proteins without structural homologs. Candidates will also contribute to global structure prediction by fragment assembly.
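The ranking step described in this abstract (one logistic-regression expert per prototype, with the top scorers proposed as candidates) can be sketched as below. Everything concrete here is an assumption made for illustration: the feature vector, the weights, the prototype names, and the fixed `top_k` cutoff. The abstract does not say how the sequence window is encoded or how the number of candidates is chosen.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def rank_prototypes(features, experts, top_k=5):
    """Score one sequence window with every expert and keep the best ones.

    `experts` maps a prototype id to (weights, bias); both are hypothetical
    parameters standing in for fitted logistic-regression coefficients.
    """
    scored = []
    for prototype_id, (weights, bias) in experts.items():
        z = bias + sum(w * x for w, x in zip(weights, features))
        scored.append((prototype_id, sigmoid(z)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy system of three experts (the paper uses 120) with made-up coefficients.
TOY_EXPERTS = {
    "proto_A": ([1.0, -1.0], 0.0),
    "proto_B": ([0.5, 0.5], -1.0),
    "proto_C": ([-1.0, 2.0], 0.0),
}

candidates = rank_prototypes([1.0, 0.0], TOY_EXPERTS, top_k=2)
for prototype_id, probability in candidates:
    print(prototype_id, round(probability, 3))
```

Under the success criterion stated in the abstract, a prediction for the window would count as correct if any returned candidate lay within 2.5 Å of the true local structure.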

9.
We describe a new method for polyproline II-type (PPII) secondary structure prediction based on tetrapeptide conformation properties using data obtained from all globular proteins in the Protein Data Bank (PDB). This is the first method for PPII prediction with a relatively high level of accuracy (approximately 60%). Our method uses only frequencies of different conformations among oligopeptides without any additional parameters. We also attempted to predict alpha-helices and beta-strands using the same approach. We find that the application of our method reveals an interrelation between sequence and structure even for very short oligopeptides (tetrapeptides).

10.
The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit known protein 3D structures using a structural alphabet, known as “Protein Blocks” (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. The library is used to encode 3D protein structures into 1D PB sequences and to capture sequence-to-structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences onto protein folds encoded as PB sequences. It does not use any information from residue contacts or sequence-search methods, nor does it explicitly incorporate the hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivities of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach, and 66% of these hits yielded correctly predicted structures. This method scales well and offers promising perspectives for structural annotation at the genomic level. It has been implemented as a web server that is freely available at http://www.bo-protscience.fr/forsa.
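The scoring idea in this abstract (one amino acid occurrence matrix per PB, used to score a query sequence against a fold encoded as a PB sequence, then judged by a Z-score) can be sketched in a strongly simplified form. The sketch assumes an ungapped, position-by-position comparison and made-up log-odds values. The actual method performs global and local threading, whose alignment procedure the abstract does not detail, and the background score set used for the Z-score here is likewise hypothetical.

```python
import statistics


def threading_score(aa_sequence, pb_sequence, log_odds):
    """Sum per-position scores of each amino acid in its PB environment.

    Simplifying assumption: the two sequences are already aligned, gap-free,
    and of equal length. Unknown (PB, amino acid) pairs score 0.0.
    """
    if len(aa_sequence) != len(pb_sequence):
        raise ValueError("sequences must have the same length in this sketch")
    return sum(
        log_odds.get(pb, {}).get(aa, 0.0)
        for aa, pb in zip(aa_sequence, pb_sequence)
    )


def z_score(score, background_scores):
    """Standardize a score against a background distribution of scores."""
    mean = statistics.mean(background_scores)
    spread = statistics.pstdev(background_scores)
    return (score - mean) / spread


# Hypothetical log-odds tables for two of the 16 PB letters.
TOY_LOG_ODDS = {
    "m": {"A": 1.0, "L": 0.5, "G": -1.0},
    "d": {"V": 1.0, "A": -0.5, "G": -0.5},
}

raw = threading_score("ALV", "mmd", TOY_LOG_ODDS)
print(raw, round(z_score(raw, [0.0, 1.0, 2.0]), 3))
```

A template would then be accepted when its Z-score exceeds a cutoff calibrated on a test set, in the way the abstract reports a cutoff chosen for 95% specificity.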

11.
Predicting accurate fragments from sequence has recently become a critical step for protein structure modeling, as protein fragment assembly techniques are presently among the most efficient approaches for de novo prediction. A key step in these approaches is, given the sequence of a protein to model, the identification of relevant fragments (candidate fragments) from a collection of the available 3D structures. These fragments can then be assembled to produce a model of the complete structure of the protein of interest. The search for candidate fragments is classically achieved by considering local sequence similarity using profile comparison or threading approaches. In the present study, we introduce a new profile comparison approach that, instead of using amino acid profiles, is based on predicted structural alphabet profiles, which contain information related to the 3D local shapes associated with the sequences. We show that structural alphabet profile-profile comparison can be used efficiently to retrieve accurate structural fragments, and we introduce an entirely new protocol for the detection of candidate fragments. It identifies fragments specific to each position of the sequence, with sizes varying between 6 and 27 amino acids. We find that it outperforms present state-of-the-art approaches in terms of (i) the accuracy of the fragments identified and (ii) the rate of true positives identified, while maintaining a high coverage score. We illustrate the relevance of the approach on the complete target sets of the two previous Critical Assessment of Techniques for Protein Structure Prediction (CASP) rounds, 9 and 10. A web server for the approach is freely available at http://bioserv.rpbs.univ-paris-diderot.fr/SAFrag.

12.
Left-handed polyproline II (PPII) helices commonly occur in globular proteins in segments of 4-8 residues. This paper analyzes the structural conservation of PPII-helices in 3 protein families: serine proteinases, aspartic proteinases, and immunoglobulin constant domains. Calculations of the number of conserved segments based on structural alignment of homologous molecules yielded similar results for the PPII-helices, the alpha-helices, and the beta-strands. The PPII-helices are consistently conserved at the level of 80-100% in the proteins with sequence identity above 20% and RMS deviation of structure alignments below 3.0 Å. The most structurally important PPII segments are conserved below this level of sequence identity. These results suggest that the PPII-helices, in addition to the other 2 secondary structure classes, should be identified as part of structurally conserved regions in proteins. This is supported by similar values for the local RMS deviations of the aligned segments for the structural classes of PPII-helices, alpha-helices, and beta-strands. The PPII-helices are shown to participate in supersecondary elements such as PPII-helix/alpha-helix. The conservation of PPII-helices depends on the conservation of a supersecondary element as a whole. PPII-helices also form links, possibly flexible, in the interdomain regions. The role of the PPII-helices in model building by homology is twofold: they serve as additional conserved elements in the structure, allowing improvement of the accuracy of a model, and they provide correct chain geometry for modeling of the segments equivalenced to them in a target sequence. The improvement in model building is demonstrated in 2 test studies.

13.
14.
15.
We present a thorough analysis of the relation between amino acid sequence and local three-dimensional structure in proteins. A library of overlapping local structural prototypes was built using an unsupervised clustering approach called “hybrid protein model” (HPM). The HPM carries out a multiple structural alignment of local folds from a non-redundant protein structure databank encoded into a structural alphabet composed of 16 protein blocks (PBs). Following previous research focusing on the HPM protocol, we have considered gaps in the local structure prototypes. This methodology allows fragments of variable length. Hence, 120 local structure prototypes were obtained. Twenty-five percent of the protein fragments learnt by HPM had gaps. An investigation of tight turns suggested that they are mainly derived from three PB series with precise locations in the HPM. The amino acid information content of all the conformational classes was examined by multivariate methods, e.g., canonical correlation analysis. This analysis points to the presence of seven amino acid equivalence classes showing high propensities for preferential local structures. In the same way, the definition of “contrast factors” based on sequence-structure properties underlines the specificity of certain structural prototypes, e.g., the dependence of Gly- or Asn-rich turns on a limited number of PBs, or the opposition between Pro-rich coils and those enriched in Ser, Thr, Asn and Glu. These results are useful for analyzing sequence-structure relationships, and could also be used to improve fragment-based methods for protein structure prediction from sequence.

16.
Three-dimensional protein structures can be described with a library of 3D fragments that define a structural alphabet. We have previously proposed such an alphabet, composed of 16 patterns of five consecutive amino acids, called Protein Blocks (PBs). These PBs have been used to describe protein backbones and to predict local structures from protein sequences. The Q16 prediction rate reaches 40.7% with an optimization procedure. This article examines two aspects of PBs. First, we determine the effect of the enlargement of databanks on their definition. The results show that the geometrical features of the different PBs are preserved (local RMSD value equal to 0.41 Å on average) and sequence-structure specificities reinforced when databanks are enlarged. Second, we improve the methods for optimizing PB predictions from sequences, revisiting the optimization procedure and exploring different local prediction strategies. Use of a statistical optimization procedure for the sequence-local structure relation improves prediction accuracy by 8% (Q16 = 48.7%). Better recognition of repetitive structures occurs without losing the prediction efficiency of the other local folds. Adding secondary structure prediction improved the accuracy of Q16 by only 1%. An entropy index (Neq), strongly related to the RMSD value of the difference between predicted PBs and true local structures, is proposed to estimate prediction quality. The Neq is linearly correlated with the Q16 prediction rate distributions, computed for a large set of proteins. An "expected" prediction rate QE16 is deduced with a mean error of 5%.
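The two quality measures named in this abstract can be illustrated with a short sketch. The abstract describes Neq only as an entropy index and gives no formula. The sketch assumes the form usually quoted in the Protein Blocks literature, the exponential of the Shannon entropy of the 16 predicted PB probabilities at a position, so that Neq ranges from 1 (one PB certain) to 16 (all PBs equally likely). Q16 is taken here to be the fraction of positions where the predicted PB equals the observed one. Function names and inputs are illustrative.

```python
import math


def neq(probabilities):
    """Equivalent number of PBs at one position.

    Assumed definition: exp of the Shannon entropy of the predicted PB
    probabilities. Zero probabilities contribute nothing to the entropy.
    """
    entropy = -sum(p * math.log(p) for p in probabilities if p > 0.0)
    return math.exp(entropy)


def q16(predicted, observed):
    """Fraction of positions whose predicted PB matches the observed PB."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("PB sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


uniform = [1.0 / 16] * 16        # no information: every PB equally likely
certain = [1.0] + [0.0] * 15     # one PB predicted with full confidence

print(round(neq(uniform), 6), neq(certain), q16("mmmd", "mmdd"))
```

A low Neq means the prediction concentrates on few PBs, which is consistent with the abstract's use of Neq as an estimate of prediction quality.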

17.
The crystal structure of the olfactory marker protein at 2.3 Å resolution   Total citations: 1, self-citations: 0, citations by others: 1
Olfactory marker protein (OMP) is a highly expressed and phylogenetically conserved cytoplasmic protein of unknown function found almost exclusively in mature olfactory sensory neurons. Electrophysiological studies of olfactory epithelia in OMP knock-out mice show strongly retarded recovery following odorant stimulation leading to an impaired response to pulsed odor stimulation. Although these studies show that OMP is a modulator of the olfactory signal-transduction cascade, its biochemical role is not established. In order to facilitate further studies on the molecular function of OMP, its crystal structure has been determined at 2.3 Å resolution using multiwavelength anomalous diffraction experiments on selenium-labeled protein. OMP is observed to form a modified beta-clamshell structure with eight antiparallel beta-strands. While OMP has no significant sequence homology to proteins of known structure, it has a similar fold to a domain found in a variety of existing structures, including in a large family of viral capsid proteins. The surface of OMP is mostly convex and lacking obvious small molecule binding sites, suggesting that it is more likely to be involved in modulating protein-protein interaction than in interacting with small molecule ligands. Three highly conserved regions have been identified as leading candidates for protein-protein interaction sites in OMP. One of these sites represents a loop known to mediate ligand interactions in the structurally homologous EphB2 receptor ligand-binding domain. This site is partially buried in the crystal structure but fully exposed in the NMR solution structure of OMP due to a change in the orientation of an alpha-helix that projects outward from the structurally invariant beta-clamshell core. Gating of this conformational change by molecular interactions in the signal-transduction cascade could be used to control access to OMP's equivalent of the EphB2 ligand-interaction loop, thereby allowing OMP to function as a molecular switch.

18.
Protein backbone angle prediction with machine learning approaches   Total citations: 2, self-citations: 0, citations by others: 2
MOTIVATION: Protein backbone torsion angle prediction provides useful local structural information that goes beyond conventional three-state (alpha, beta and coil) secondary structure predictions. Accurate prediction of protein backbone torsion angles will substantially improve modeling procedures for local structures of protein sequence segments, especially in modeling loop conformations that do not form regular structures as in alpha-helices or beta-strands. RESULTS: We have devised two novel automated methods in protein backbone conformational state prediction: one method is based on support vector machines (SVMs); the other method combines a standard feed-forward back-propagation artificial neural network (NN) with a local structure-based sequence profile database (LSBSP1). Extensive benchmark experiments demonstrate that both methods have improved the prediction accuracy rate over the previously published methods for conformation state prediction when using an alphabet of three or four states. AVAILABILITY: LSBSP1 and the NN algorithm have been implemented in PrISM.1, which is available from www.columbia.edu/~ay1/. SUPPLEMENTARY INFORMATION: Supplementary data for the SVM method can be downloaded from the Website www.cs.columbia.edu/compbio/backbone.

19.
Local structures in denatured proteins may be important in guiding a polypeptide chain during the folding and misfolding processes. The existence of local structures in chemically denatured proteins is a highly controversial issue. NMR parameters [coupling constants ³J(Hα,HN) and chemical shifts] of chemically denatured proteins in general deviate little from their values in small peptides. These peptides were presumed to be completely unstructured; therefore, chemically denatured proteins were considered to be random coils. But recent experimental studies show that small peptides adopt relatively stable structures in aqueous solutions. Small deviations of the NMR parameters from their values in small peptides may thus actually indicate the existence of local structures in chemically denatured proteins. Using NMR data and theoretical predictions, we show here that fluctuating beta-strands exist in urea-denatured ubiquitin (8 M urea at pH 2). Residues in such beta-strands more frequently populate the left side of the broad beta region of φ-ψ space. Urea-denatured ubiquitin contains no detectable beta-sheet secondary structures; nevertheless, the fluctuating beta-strands in urea-denatured ubiquitin coincide with the beta-strands in the native state. Formation of beta-strands is in accord with the electrostatic screening model of unfolded proteins. In this model, the free energy of a residue in an unfolded protein is determined by the local backbone electrostatics and its screening by backbone solvation. These energy terms introduce strong electrostatic coupling between neighboring residues, which causes cooperative formation of beta-strands in denatured proteins. We propose that fluctuating beta-strands in denatured proteins may serve as initiation sites to form fibrils.

20.
Atu4866 is a 79-residue conserved hypothetical protein of unknown function from Agrobacterium tumefaciens. Protein sequence alignments show that it shares ≥60% sequence identity with 20 other hypothetical proteins of bacterial origin. However, the structures and functions of these proteins remain unknown. To gain insight into the function of this family of proteins, we have determined the structure of Atu4866 as a target of a structural genomics project using solution NMR spectroscopy. Our results reveal that Atu4866 adopts a streptavidin-like fold featuring a beta-barrel/sandwich formed by eight antiparallel beta-strands. Further structural analysis identified a continuous patch of conserved residues on the surface of Atu4866 that may constitute a potential ligand-binding site.
