
1.

Immunoglobulin fold and tandem repeat structures in proteoglycan N-terminal domains and link protein (cited by 7: 0 self-citations, 7 by others)

S J Perkins A S Nealis J Dudhia T E Hardingham 《Journal of molecular biology》1989,206(4):737-753

Detailed primary sequence and secondary structure analyses are reported for the hyaluronate binding region (G1 domain) and link protein of proteoglycan aggregates. These are based on six full or partial sequences from the chicken, pig, human, rat and bovine proteins. Determinations of a full pig and a partial human link protein sequence are reported in the Appendix. Five sequences at the N terminus in both proteins were compared with the structures of 11 variable immunoglobulin (Ig) fold domains for which crystal structures are available. Despite only modest sequence homology, a clear alignment could be proposed. Analysis of this shows that the equivalents of the first and second hypervariable segments are now significantly longer, and both proteins have N-terminal extensions that are up to 23 residues in length. Secondary structure predictions showed that these sequences could be identified with available crystal structures for the variable Ig fold. However the hydrophobic residues involved in interactions between the light and heavy chains in Igs are replaced by hydrophilic charged groups in both proteins. These results imply that both proteins are members of the Ig superfamily, but exhibit structural differences distinct from other members of this superfamily for which crystal structures are known. The proteoglycan tandem repeat (PTR) is a repeat of 99 residues that is found twice in the amino acid sequence of link protein and the proteoglycan G1 domain adjacent to the Ig fold, and also twice in the proteoglycan G2 domain. A total of 16 PTRs was available for analysis. Compositional analyses show that these are positively charged if these originate from link protein, and negatively charged if from the G1 or G2 domains. The 16 Robson secondary structure predictions for the PTRs were averaged to improve the statistics of the prediction, and checked by comparison with Chou-Fasman calculations. 
A strong alpha-helix prediction was found at residues 13 to 25, and several beta-strands were predicted. The overall content is 18% alpha-helix and 28% beta-sheet, with 44% of the remaining sequence being predicted as turns. These analyses show that both the proteoglycan G1 domain and link protein are constructed from two distinct globular components, which may provide the two functional roles of these proteins in proteoglycan aggregation.

2.

Conserved patterns and interactions in the unfolding transition state across SH3 domain structural homologues

Cullen Demakis Matthew C. Childers Valerie Daggett 《Protein science : a publication of the Protein Society》2021,30(2):391

Proteins with similar structures are generally assumed to arise from similar sequences. However, there are more cases than not where this is not true. The dogma is that sequence determines structure; how, then, can very different sequences fold to the same structure? Here, we employ high-temperature unfolding simulations to probe the pathways and specific interactions that direct the folding and unfolding of the SH3 domain. The SH3 metafold in the Dynameomics Database consists of 753 proteins with the same structure, but varied sequences and functions. To investigate the relationship between sequence and structure, we selected 17 targets from the SH3 metafold with high sequence variability. Six unfolding simulations were performed for each target and transition states were identified, revealing two general folding/unfolding pathways at the transition state. Transition states were also expressed as mathematical graphs of connected chemical nodes, and it was found that three positions within the structure, independent of sequence, were consistently more connected within the graph than any other nearby positions in the sequence. These positions represent a hub connecting different portions of the structure. Multiple sequence alignment and covariation analyses also revealed certain positions that were more conserved due to packing constraints and stabilizing long-range contacts. This study demonstrates that members of the SH3 domain with different sequences can unfold through two main pathways, but certain characteristics are conserved regardless of the sequence or unfolding pathway. While sequence determines structure, we show that disparate sequences can provide similar interactions that influence folding and lead to similar structures.
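The graph analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes that contacts in a transition-state structure are already available as pairs of residue positions; the function names, the `window` size, and the `min_degree` cutoff are hypothetical and are not taken from the paper, which does not describe its graph construction in the abstract.

```python
from collections import defaultdict

def degree_by_position(contacts):
    """Count graph edges (contacts) touching each residue position."""
    degree = defaultdict(int)
    for i, j in contacts:
        degree[i] += 1
        degree[j] += 1
    return dict(degree)

def local_hubs(degree, window=2, min_degree=2):
    """Return positions that are more connected than every nearby position.

    A position qualifies when its degree is at least min_degree and strictly
    exceeds the degree of all other positions within +/- window in sequence.
    Both thresholds are illustrative assumptions.
    """
    hubs = []
    for pos, deg in degree.items():
        nearby = [degree.get(p, 0)
                  for p in range(pos - window, pos + window + 1) if p != pos]
        if deg >= min_degree and deg > max(nearby):
            hubs.append(pos)
    return sorted(hubs)

# Toy contact list (hypothetical residue numbers)
contacts = [(5, 20), (5, 30), (5, 40), (6, 20), (20, 41)]
degree = degree_by_position(contacts)
hubs = local_hubs(degree)
```

In this toy input, positions 5 and 20 each take part in three contacts and stand out from their sequence neighbours, which is the sense in which a position acts as a "hub".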

3.

Automated search of natively folded protein fragments for high-throughput structure determination in structural genomics


Kuroda Y Tani K Matsuo Y Yokoyama S 《Protein science : a publication of the Protein Society》2000,9(12):2313-2321

Structural genomic projects envision almost routine protein structure determinations, which are currently imaginable only for small proteins with molecular weights below 25,000 Da. For larger proteins, structural insight can be obtained by breaking them into small segments of amino acid sequences that can fold into native structures, even when isolated from the rest of the protein. Such segments are autonomously folding units (AFU) and have sizes suitable for fast structural analyses. Here, we propose to expand an intuitive procedure often employed for identifying biologically important domains to an automatic method for detecting putative folded protein fragments. The procedure is based on the recognition that large proteins can be regarded as a combination of independent domains conserved among diverse organisms. We thus have developed a program that reorganizes the output of BLAST searches and detects regions with a large number of similar sequences. To automate the detection process, it is reduced to a simple geometrical problem of recognizing rectangular shaped elevations in a graph that plots the number of similar sequences at each residue of a query sequence. We used our program to quantitatively corroborate the premise that segments with conserved sequences correspond to domains that fold into native structures. We applied our program to a test data set composed of 99 amino acid sequences containing 150 segments with structures listed in the Protein Data Bank, and thus known to fold into native structures. Overall, the fragments identified by our program have an almost 50% probability of forming a native structure, and comparable results are observed with sequences containing domain linkers classified in SCOP. Furthermore, we verified that our program identifies AFU in libraries from various organisms, and we found a significant number of AFU candidates for structural analysis, covering an estimated 5 to 20% of the genomic databases. 
Altogether, these results argue that methods based on sequence similarity can be useful for dissecting large proteins into small autonomously folding domains, and such methods may provide an efficient support to structural genomics projects.
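The detection step described in this abstract (counting similar sequences at each residue and looking for rectangular elevations) can be sketched as follows. This is an illustration only: it assumes BLAST hits have already been reduced to 1-based inclusive (start, end) ranges on the query, and the function names and the `min_hits` and `min_len` thresholds are hypothetical rather than parameters of the authors' program.

```python
from typing import List, Tuple

def coverage_profile(query_len: int, hits: List[Tuple[int, int]]) -> List[int]:
    """Count how many hits cover each residue of the query."""
    profile = [0] * query_len
    for start, end in hits:
        # Clip each hit to the query boundaries before counting.
        for pos in range(max(start, 1), min(end, query_len) + 1):
            profile[pos - 1] += 1
    return profile

def find_elevations(profile: List[int], min_hits: int,
                    min_len: int) -> List[Tuple[int, int]]:
    """Return 1-based (start, end) segments where coverage stays >= min_hits."""
    segments = []
    start = None
    for i, count in enumerate(profile, start=1):
        if count >= min_hits and start is None:
            start = i                      # an elevation begins
        elif count < min_hits and start is not None:
            if i - start >= min_len:       # segment spans start..i-1
                segments.append((start, i - 1))
            start = None
    # Close an elevation that runs to the end of the query.
    if start is not None and len(profile) - start + 1 >= min_len:
        segments.append((start, len(profile)))
    return segments

# Toy example: three overlapping hits near the N terminus, one isolated hit
hits = [(1, 10), (2, 10), (1, 9), (15, 20)]
profile = coverage_profile(20, hits)
candidates = find_elevations(profile, min_hits=2, min_len=5)
```

A simple threshold scan like this only approximates the idea of recognising rectangular elevations; the published method may treat edges and noise differently.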

4.

Hidden Markov models that use predicted secondary structures for fold recognition. (cited by 6: 0 self-citations, 6 by others)

J Hargbo A Elofsson 《Proteins》1999,36(1):68-76

There are many proteins that share the same fold but have no clear sequence similarity. To predict the structure of these proteins, so-called "protein fold recognition methods" have been developed. During the last few years, improvements of protein fold recognition methods have been achieved through the use of predicted secondary structures (Rice and Eisenberg, J Mol Biol 1997;267:1026-1038), as well as by using multiple sequence alignments in the form of hidden Markov models (HMM) (Karplus et al., Proteins Suppl 1997;1:134-139). To test the performance of different fold recognition methods, we have developed a rigorous benchmark where representatives for all proteins of known structure are matched against each other. Using this benchmark, we have compared the performance of automatically created hidden Markov models with standard sequence search methods. Further, we combine the use of predicted secondary structures and multiple sequence alignments into a combined method that performs better than methods that do not use this combination of information. Using only single sequences, the correct fold of a protein was detected for 10% of the test cases in our benchmark. Including multiple sequence information increased this number to 16%, and when predicted secondary structure information was included as well, the fold was correctly identified in 20% of the cases. Moreover, if the correct secondary structure was used, 27% of the proteins could be correctly matched to a fold. For comparison, blast2, fasta, and ssearch identify the fold correctly in 13-17% of the cases. Thus, standard pairwise sequence search methods perform almost as well as hidden Markov models in our benchmark. This is probably because the automatically created multiple sequence alignments used in this study do not contain enough diversity and because the current generation of hidden Markov models does not perform very well when built from a few sequences.

5.

Identification of distant homologues of fibroblast growth factors suggests a common ancestor for all beta-trefoil proteins

Ponting CP Russell RB 《Journal of molecular biology》2000,302(5):1041-1047

Determination of the structures of fibroblast growth factors and interleukin-1s has previously revealed that they both adopt a beta-trefoil fold, similar to those found in Kunitz soybean trypsin inhibitors, ricin-like toxins, plant agglutinins and hisactophilin. These families possess distinct functions and occur in different subcellular localisations, and they appear to lack significant similarities in their sequences, ligands and modes of ligand binding. We have analysed the significance of sequence identities observed after structure alignment and provide statistical evidence that these beta-trefoil proteins are all homologues, having arisen from a common ancestor. In addition, we have explored the sequence space of all beta-trefoil proteins and have determined that the actin-binding proteins fascins, and other proteins of unknown function, are beta-trefoil family homologues. Unlike other beta-trefoil proteins, the triplicated repeats in each of the four beta-trefoil domains of fascins are significantly similar in sequence. This hints at how the beta-trefoil fold arose from the duplication of an ancestral gene encoding a homotrimeric single-repeat protein. The combined analysis of structure and sequence databases for detecting significant similarities is suggested as a highly sensitive approach to determining the common ancestry of extremely divergent homologues.

6.

Protein fold recognition using sequence-derived predictions. (cited by 18: 9 self-citations, 9 by others)


D. Fischer D. Eisenberg 《Protein science : a publication of the Protein Society》1996,5(5):947-955

In protein fold recognition, one assigns a probe amino acid sequence of unknown structure to one of a library of target 3D structures. Correct assignment depends on effective scoring of the probe sequence for its compatibility with each of the target structures. Here we show that, in addition to the amino acid sequence of the probe, sequence-derived properties of the probe sequence (such as the predicted secondary structure) are useful in fold assignment. The additional measure of compatibility between probe and target is the level of agreement between the predicted secondary structure of the probe and the known secondary structure of the target fold. That is, we recommend a sequence-structure compatibility function that combines previously developed compatibility functions (such as the 3D-1D scores of Bowie et al. [1991] or sequence-sequence replacement tables) with the predicted secondary structure of the probe sequence. The effect on fold assignment of adding predicted secondary structure is evaluated here by using a benchmark set of proteins (Fischer et al., 1996a). The 3D structures of the probe sequences of the benchmark are actually known, but are ignored by our method. The results show that the inclusion of the predicted secondary structure improves fold assignment by about 25%. The results also show that, if the true secondary structure of the probe were known, correct fold assignment would increase by an additional 8-32%. We conclude that incorporating sequence-derived predictions significantly improves assignment of sequences to known 3D folds. Finally, we apply the new method to assign folds to sequences in the SWISSPROT database; six fold assignments are given that are not detectable by standard sequence-sequence comparison methods; for two of these, the fold is known from X-ray crystallography and the fold assignment is correct.
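The idea of adding secondary-structure agreement to an existing compatibility score can be sketched in a few lines. The abstract does not state how the two terms are combined, so the additive form, the `weight` parameter, and the function names below are assumptions made for illustration; the strings use the common H/E/C (helix/strand/coil) alphabet and are assumed to be already aligned position by position.

```python
def ss_agreement(predicted: str, observed: str) -> float:
    """Fraction of aligned positions where predicted and known states match."""
    if len(predicted) != len(observed):
        raise ValueError("secondary-structure strings must have equal length")
    if not predicted:
        return 0.0
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)

def combined_score(base_score: float, predicted: str, observed: str,
                   weight: float = 1.0) -> float:
    """Base compatibility score plus a weighted agreement term (assumed form)."""
    return base_score + weight * ss_agreement(predicted, observed)

# Toy example: 7 of 8 aligned positions agree
predicted = "HHHHEECC"   # predicted states for the probe sequence
observed = "HHHCEECC"    # known states of the target fold
agreement = ss_agreement(predicted, observed)
score = combined_score(2.0, predicted, observed, weight=4.0)
```

Here `base_score` stands in for whatever sequence-structure score is already in use, such as a 3D-1D profile score; its value in the example is arbitrary.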

7.

Evolutionary Dynamics on Protein Bi-stability Landscapes can Potentially Resolve Adaptive Conflicts

Tobias Sikosek Erich Bornberg-Bauer Hue Sun Chan 《PLoS computational biology》2012,8(9)

Experimental studies have shown that some proteins exist in two alternative native-state conformations. It has been proposed that such bi-stable proteins can potentially function as evolutionary bridges at the interface between two neutral networks of protein sequences that fold uniquely into the two different native conformations. Under adaptive conflict scenarios, bi-stable proteins may be of particular advantage if they simultaneously provide two beneficial biological functions. However, computational models that simulate protein structure evolution do not yet recognize the importance of bi-stability. Here we use a biophysical model to analyze sequence space to identify bi-stable or multi-stable proteins with two or more equally stable native-state structures. The inclusion of such proteins enhances phenotype connectivity between neutral networks in sequence space. Consideration of the sequence space neighborhood of bridge proteins revealed that bi-stability decreases gradually with each mutation that takes the sequence further away from an exactly bi-stable protein. With relaxed selection pressures, we found that bi-stable proteins in our model are highly successful under simulated adaptive conflict. Inspired by these model predictions, we developed a method to identify real proteins in the PDB with bridge-like properties, and have verified a clear bi-stability gradient for a series of mutants studied by Alexander et al. (Proc Nat Acad Sci USA 2009, 106:21149–21154) that connect two sequences that fold uniquely into two different native structures via a bridge-like intermediate mutant sequence. Based on these findings, new testable predictions for future studies on protein bi-stability and evolution are discussed.

8.

Solution NMR structures of IgG binding domains with artificially evolved high levels of sequence identity but different folds

He Y Yeh DC Alexander P Bryan PN Orban J 《Biochemistry》2005,44(43):14055-14061

We describe here the solution NMR structures of two IgG binding domains with highly homologous sequences but different three-dimensional structures. The proteins, G311 and A219, are derived from the IgG binding domains of their wild-type counterparts, protein G and protein A, respectively. Through a series of site-directed mutations and phage display selections, the sequences of G311 and A219 were designed to converge to a point of high-level sequence identity while keeping their respective wild-type tertiary folds. Structures of both artificially evolved sequences were determined by NMR spectroscopy. The main chain fold of G311 can be superimposed on the wild-type alpha/beta protein G structure with a backbone rmsd of 1.4 Å, and the A219 structure can be overlaid on the wild-type three-alpha-helix protein A fold also with a backbone rmsd of 1.4 Å. The structure of G311, in particular, accommodates a large number of mutational changes without undergoing a change in the overall fold of the main chain. The structural differences are maintained despite a high level (59%) of sequence identity. These proteins serve as starting points for further experiments that will probe basic concepts of protein folding and conformational switching.

9.

Developing optimal non-linear scoring function for protein design

Hu C Li X Liang J 《Bioinformatics (Oxford, England)》2004,20(17):3080-3098

MOTIVATION: Protein design aims to identify sequences compatible with a given protein fold but incompatible with any alternative folds. To select the correct sequences and to guide the search process, a design scoring function is critically important. Such a scoring function should be able to characterize the global fitness landscape of many proteins simultaneously. RESULTS: To find optimal design scoring functions, we introduce two geometric views and propose a formulation using a mixture of non-linear Gaussian kernel functions. We aim to solve a simplified protein sequence design problem. Our goal is to distinguish each native sequence for a major portion of representative protein structures from a large number of alternative decoy sequences, each a fragment from proteins of different folds. Our scoring function discriminates perfectly a set of 440 native proteins from 14 million sequence decoys. We show that no linear scoring function can succeed in this task. In a blind test of unrelated proteins, our scoring function misclassifies only 13 native proteins out of 194. This compares favorably with about three to four times more misclassifications when optimal linear functions reported in the literature are used. We also discuss how to develop protein folding scoring functions.
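A scoring function built from a mixture of Gaussian kernels can be sketched generically as a weighted sum of kernels centred on reference points. The abstract does not give the authors' exact formulation, so everything below is an assumption for illustration: the feature vectors, the `centers` and `weights`, the kernel width `sigma`, and the convention that a positive score indicates a native-like sequence.

```python
import math
from typing import Sequence

def gaussian_kernel(x: Sequence[float], y: Sequence[float],
                    sigma: float) -> float:
    """exp(-||x - y||^2 / (2 * sigma^2)) for two equal-length vectors."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_score(x: Sequence[float],
                 centers: Sequence[Sequence[float]],
                 weights: Sequence[float],
                 sigma: float = 1.0) -> float:
    """Weighted sum of Gaussian kernels evaluated at feature vector x."""
    return sum(w * gaussian_kernel(x, c, sigma)
               for w, c in zip(weights, centers))

# Toy example: one "native-like" centre (weight +1), one "decoy-like" (-1)
centers = [(0.0, 0.0), (3.0, 4.0)]
weights = [1.0, -1.0]
native_score = kernel_score((0.0, 0.0), centers, weights)
decoy_score = kernel_score((3.0, 4.0), centers, weights)
```

Unlike a linear function of the features, a sum of this kind can carve out curved decision boundaries, which is the general motivation for non-linear kernels when linear separation fails.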

10.

Theoretical model of restriction endonuclease HpaI in complex with DNA, predicted by fold recognition and validated by site-directed mutagenesis

Skowronek KJ Kosinski J Bujnicki JM 《Proteins》2006,63(4):1059-1068

Type II restriction enzymes are commercially important deoxyribonucleases and very attractive targets for protein engineering of new specificities. At the same time they are a very challenging test bed for protein structure prediction methods. Typically, enzymes that recognize different sequences show little or no amino acid sequence similarity to each other and to other proteins. Based on crystallographic analyses that revealed the same PD-(D/E)XK fold for more than a dozen case studies, they were nevertheless considered to be related until the combination of bioinformatics and mutational analyses demonstrated that some of these proteins belong to other, unrelated folds: PLD, HNH, and GIY-YIG. As a part of a large-scale project aiming at identification of a three-dimensional fold for all type II REases with known sequences (currently approximately 1000 proteins), we carried out preliminary structure prediction and selected candidates for experimental validation. Here, we present the analysis of HpaI REase, an ORFan with no detectable homologs, for which we detected a structural template by protein fold recognition, constructed a model using the FRankenstein monster approach and identified a number of residues important for the DNA binding and catalysis. These predictions were confirmed by site-directed mutagenesis and in vitro analysis of the mutant proteins. The experimentally validated model of HpaI will serve as a low-resolution structural platform for evolutionary considerations in the subgroup of blunt-cutting REases with different specificities. The research protocol developed in the course of this work represents a streamlined version of the previously used techniques and can be used in a high-throughput fashion to build and validate models for other enzymes, especially ORFans that exhibit no sequence similarity to any other protein in the database.

11.

Folding mechanisms of proteins with high sequence identity but different folds

Scott KA Daggett V 《Biochemistry》2007,46(6):1545-1556

The problem of how a protein folds from a linear chain of amino acids to the three-dimensional structure necessary for function is often investigated using proteins with a low degree of sequence identity that adopt different folds. The design of pairs of proteins with a high degree of sequence identity but different folds offers the opportunity for a complementary study; in two highly similar sequences, which residues are the most important in directing folding to a particular structure? Here we use molecular dynamics simulations to characterize the folding-unfolding pathways of a pair of proteins designed by Bryan and co-workers [Alexander, P. A., et al. (2005) Biochemistry 44, 14045-14054; He, Y. N., et al. (2005) Biochemistry 44, 14055-14061]. Despite being 59% identical, the two protein sequences fold to two different structures. The first sequence folds to the alpha+beta protein G structure and the second to the all-alpha-helical protein A structure. We show that the final protein structure is determined early along the folding pathway. In folding to the protein G structure, the single alpha-helix (alpha1) and the beta3-beta4 turn fold early. Formation of the hairpin turn essentially prevents folding to helical structure in this region of the protein. This early structure is then consolidated by formation of long-range hydrophobic interactions between alpha1 and the beta3-beta4 turn. The protein A sequence differs both in the residues that form the beta3-beta4 turn and also in many of the residues that form the early hydrophobic interactions in the protein G structure. Instead, in the protein A sequence, a more hierarchical mechanism is observed, with helices folding before many of the tertiary interactions are formed. We find that small, but critical, sequence differences determine the topology of the protein early along the folding pathway, which helps to explain the process by which one fold can evolve into another.

12.

beta-Trefoil fold. Patterns of structure and sequence in the Kunitz inhibitors interleukins-1 beta and 1 alpha and fibroblast growth factors.

A G Murzin A M Lesk C Chothia 《Journal of molecular biology》1992,223(2):531-543

Previous crystallographic analyses of the Kunitz inhibitors from soybean, Erythrina caffra and wheat, the interleukins-1 beta and 1 alpha and the acidic and basic fibroblast growth factors have shown that they contain a most unusual fold. It is formed by six two-stranded hairpins. Three of these form a barrel structure and the other three are in a triangular array that caps the barrel. The arrangement of the secondary structures gives the molecules a pseudo 3-fold axis. Although the different proteins have very similar structures, many of their sequences have no significant similarities overall. The structural determinants of this fold are described and discussed in this paper. The barrels in the different proteins have the same geometrical features: six strands tilted at 56 degrees to the barrel axis; a barrel diameter of 16 Å; and the beta-sheet hydrogen bonded so that it is staggered with a shear number of 12. These features fit McLachlan's equations for ideal barrels formed by beta-sheets. The wide diameter of the barrels is filled by layers of residues that, while not identical in the different proteins, are, in almost all cases, large. The structure of the triangular array of hairpins is determined by the coiling of the strands and the packing of hairpin residues against each other and against residues from the interior of the barrel. The major sequence requirements of this fold are large or medium hydrophobic residues at 18 buried sites. In the different structures the total volume of these residues is 3000 (± 120) Å³. The polyhedron model of protein architecture is used to demonstrate that the main, and in particular the symmetrical, features of this fold arise from the ideal and equal packing of six hairpins, modified only slightly to form hydrogen bonds between the hairpins.
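The barrel geometry quoted in this abstract can be checked numerically. The sketch below uses the commonly quoted form of the ideal-barrel relations, tan(alpha) = S*a / (n*b) and R = b / (2 * sin(pi/n) * cos(alpha)), where n is the number of strands, S the shear number, alpha the strand tilt and R the barrel radius. The spacings a = 3.3 Å (between residues along a strand) and b = 4.4 Å (between adjacent strands) are assumed typical values, not figures taken from the abstract, and the function name is hypothetical.

```python
import math

def barrel_geometry(n_strands: int, shear: int,
                    a: float = 3.3, b: float = 4.4):
    """Return (tilt in degrees, radius in angstroms) for an ideal beta-barrel.

    a: assumed residue spacing along a strand (angstroms)
    b: assumed spacing between adjacent strands (angstroms)
    """
    alpha = math.atan((shear * a) / (n_strands * b))
    radius = b / (2.0 * math.sin(math.pi / n_strands) * math.cos(alpha))
    return math.degrees(alpha), radius

# Values from the abstract: six strands, shear number 12
tilt_deg, radius = barrel_geometry(n_strands=6, shear=12)
diameter = 2.0 * radius
```

With these assumed spacings, n = 6 and S = 12 give a tilt of about 56.3 degrees and a diameter of about 15.9 Å, in line with the 56 degrees and 16 Å reported above.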

13.

Using a Hydrophobic Contact Potential to Evaluate Native and Near-native Folds Generated by Molecular Dynamics Simulations

《Journal of molecular biology》1996,257(3):716-725

There are several knowledge-based energy functions that can distinguish the native fold from a pool of grossly misfolded decoys for a given sequence of amino acids. These decoys, which are typically generated by mounting, or “threading”, the sequence onto the backbones of unrelated protein structures, tend to be non-compact and quite different from the native structure: the root-mean-squared (RMS) deviations from the native are commonly in the range of 15 to 20 Å. Effective energy functions should also demonstrate a similar recognition capability when presented with compact decoys that depart only slightly in conformation from the correct structure (i.e. those with RMS deviations of ∼5 Å or less). Recently, we developed a simple yet powerful method for native fold recognition based on the tendency for native folds to form hydrophobic cores. Our energy measure, which we call the hydrophobic fitness score, is challenged to recognize the native fold from 2000 near-native structures generated for each of five small monomeric proteins. First, 1000 conformations for each protein were generated by molecular dynamics simulation at room temperature. The average RMS deviation of this set of 5000 was 1.5 Å. A total of 323 decoys had energies lower than native; however, none of these had RMS deviations greater than 2 Å. Another 1000 structures were generated for each at high temperature, in which a greater range of conformational space was explored (4.3 Å average RMS deviation). Out of this set, only seven decoys were misrecognized. The hydrophobic fitness energy of a conformation is strongly dependent upon the RMS deviation. On average our potential yields energy values which are lowest for the population of structures generated at room temperature, intermediate for those produced at high temperature and highest for those constructed by threading methods. In general, the lowest energy decoy conformations have backbones very close to native structure. 
The possible utility of our method for screening backbone candidates for the purpose of modelling by side-chain packing optimization is discussed.

14.

An integrated approach to the analysis and modeling of protein sequences and structures. III. A comparative study of sequence conservation in protein structural families using multiple structural alignments

Yang AS Honig B 《Journal of molecular biology》2000,301(3):691-711

The information required to generate a protein structure is contained in its amino acid sequence, but how three-dimensional information is mapped onto a linear sequence is still incompletely understood. Multiple structure alignments of similar protein structures have been used to investigate conserved sequence features but contradictory results have been obtained, due, in large part, to the absence of objective criteria to be used in the construction of sequence profiles and in the quantitative comparison of alignment results. Here, we report a new procedure for multiple structure alignment and use it to construct structure-based sequence profiles for similar proteins. The definition of "similar" is based on the structural alignment procedure and on the protein structural distance (PSD) described in paper I of this series, which offers an objective measure for protein structure relationships. Our approach is tested in two well-studied groups of proteins: serine proteases and Ig-like proteins. It is demonstrated that the quality of a sequence profile generated by a multiple structure alignment is quite sensitive to the PSD used as a threshold for the inclusion of proteins in the alignment. Specifically, if the proteins included in the aligned set are too distant in structure from one another, there will be a dilution of information and patterns that are relevant to a subset of the proteins are likely to be lost. In order to understand better how the same three-dimensional information can be encoded in seemingly unrelated sequences, structure-based sequence profiles are constructed for subsets of proteins belonging to nine superfolds. We identify patterns of relatively conserved residues in each subset of proteins. It is demonstrated that the most conserved residues are generally located in the regions where tertiary interactions occur and that are relatively conserved in structure.
Nevertheless, the conservation patterns are relatively weak in all cases studied, indicating that structure-determining factors that do not require a particular sequential arrangement of amino acids, such as secondary structure propensities and hydrophobic interactions, are important in encoding protein fold information. In general, we find that similar structures can fold without having a set of highly conserved residue clusters or a well-conserved sequence profile; indeed, in some cases there is no apparent conservation pattern common to structures with the same fold. Thus, when a group of proteins exhibits a common and well-defined sequence pattern, it is more likely that these sequences have a close evolutionary relationship rather than the similarities having arisen from the structural requirements of a given fold.

15.

Identifying sequence-structure pairs undetected by sequence alignments

Miyazawa S Jernigan RL 《Protein engineering》2000,13(7):459-475

We examine how effectively simple potential functions previously developed can identify compatibilities between sequences and structures of proteins for database searches. The potential function consists of pairwise contact energies, repulsive packing potentials of residues for overly dense arrangement and short-range potentials for secondary structures, all of which were estimated from statistical preferences observed in known protein structures. Each potential energy term was modified to represent compatibilities between sequences and structures for globular proteins. Pairwise contact interactions in a sequence-structure alignment are evaluated in a mean field approximation on the basis of probabilities of site pairs to be aligned. Gap penalties are assumed to be proportional to the number of contacts at each residue position, and as a result gaps will be more frequently placed on protein surfaces than in cores. In addition to minimum energy alignments, we use probability alignments made by successively aligning site pairs in order by pairwise alignment probabilities. The results show that the present energy function and alignment method can detect well both folds compatible with a given sequence and, inversely, sequences compatible with a given fold, and yield mostly similar alignments for these two types of sequence and structure pairs. Probability alignments consisting of most reliable site pairs only can yield extremely small root mean square deviations, and including less reliable pairs increases the deviations. Also, it is observed that secondary structure potentials are usefully complementary to yield improved alignments with this method. Remarkably, by this method some individual sequence-structure pairs are detected having only 5-20% sequence identity.

16.

Assessing the reliability of sequence similarities detected through hydrophobic cluster analysis

Silva PJ 《Proteins》2008,70(4):1588-1594

Hydrophobic cluster analysis (HCA) has long been used as a tool to detect distant homologies between protein sequences, and to classify them into different folds. However, it relies on expert human intervention, and is sensitive to subjective interpretations of pattern similarities. In this study, we describe a novel algorithm to assess the similarity of hydrophobic amino acid distributions between two sequences. Our algorithm correctly identifies as misattributions several HCA-based proposals of structural similarity between unrelated proteins present in the literature. We have also used this method to identify the proper fold of a large variety of sequences, and to automatically select the most appropriate structure for homology modeling of several proteins with low sequence identity to any other member of the protein data bank. Automatic modeling of the target proteins based on these templates yielded structures with TM-scores (vs. experimental structures) above 0.60, even without further refinement. Besides enabling a reliable identification of the correct fold of an unknown sequence and the choice of suitable templates, our algorithm also shows that whereas most structural classes of proteins are very homogeneous in hydrophobic cluster composition, a tenth of the described families are compatible with a large variety of hydrophobic patterns. We have built a browsable database of every major representative hydrophobic cluster pattern present in each structural class of proteins, freely available at http://www2.ufp.pt/ pedros/HCA_db/index.htm.

17.

Secondary structure switching in Cro protein evolution

Newlove T Konieczka JH Cordes MH 《Structure (London, England : 1993)》2004,12(4):569-581

We report the solution structure of the Cro protein from bacteriophage P22. Comparisons of its sequence and structure to those of lambda Cro strongly suggest an alpha-to-beta secondary structure switching event during Cro evolution. The folds of P22 Cro and lambda Cro share a three alpha helix fragment comprising the N-terminal half of the domain. However, P22 Cro's C terminus folds as two helices, while lambda Cro's folds as a beta hairpin. The all-alpha fold found for P22 Cro appears to be ancestral, since it also occurs in cI proteins, which are anciently duplicated paralogues of Cro. PSI-BLAST and transitive homology analyses strongly suggest that the sequences of P22 Cro and lambda Cro are globally homologous despite encoding different folds. The alpha+beta fold of lambda Cro therefore likely evolved from its all-alpha ancestor by homologous secondary structure switching, rather than by nonhomologous replacement of both sequence and structure.

18.

Rothery RA Kalra N Turner RJ Weiner JH 《Journal of molecular microbiology and biotechnology》2002,4(2):133-150

Integral membrane proteins usually have a predominantly alpha-helical secondary structure in which transmembrane segments are connected by membrane-extrinsic loops. Although a number of membrane protein structures have been reported in recent years, in most cases transmembrane topologies are initially predicted using a variety of theoretical techniques, including hydropathy analyses and the "positive inside" rule. We have explored the use of plots of the distribution of sequence similarity within families of membrane proteins comprising homeomorphic domains as a new method for the prediction/verification of the orientation of transmembrane topology models within certain families of multimeric respiratory chain enzymes. Within such proteins, analyses of sequence similarity can: i) identify heme and/or quinol binding sites; ii) identify potential electron-transfer conduits to/from prosthetic groups; and iii) locate regions defining potential subunit-subunit interactions. We mined emerging bioinformatic data for sequences of 11 families of membrane-intrinsic proteins that are part of multimeric respiratory chain complexes that also have membrane-extrinsic subunits. The sequences of each family were then aligned and the resultant alignments converted into a graphical format recording an empirical measure of the sequence similarity plotted versus residue position. In each case, this plot was compared to the predicted transmembrane topology. With one exception, there is a strong correlation between the existence
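The step of converting a family alignment into a similarity-versus-position profile can be illustrated with a small sketch. The abstract does not specify which empirical similarity measure was used, so the code below substitutes a simple stand-in: the fraction of sequences that carry the most common residue in each alignment column, optionally smoothed with a sliding window. The function names, the treatment of gaps as mismatches, and the `window` parameter are assumptions made for illustration.

```python
from collections import Counter


def column_similarity(alignment):
    """Per-column similarity for a list of equal-length aligned sequences.

    The score for a column is the fraction of sequences that carry the
    most common residue in that column. Gap characters ('-') count as
    mismatches. This is an assumed stand-in measure, not the one used
    in the cited study.
    """
    n = len(alignment)
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / n)
    return scores


def smooth(values, window=3):
    """Sliding-window mean; the window is truncated at both ends."""
    half = window // 2
    out = []
    for k in range(len(values)):
        chunk = values[max(0, k - half): k + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

Plotting `smooth(column_similarity(alignment))` against the column index gives a profile that could then be laid alongside a predicted transmembrane topology for comparison.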

19.

Protein structure prediction: making AWSEM AWSEM‐ER by adding evolutionary restraints

Brian J. Sirovetz Nicholas P. Schafer Peter G. Wolynes 《Proteins》2017,85(11):2127-2142

Protein sequences have evolved to fold into functional structures, resulting in families of diverse protein sequences that all share the same overall fold. One can harness protein family sequence data to infer likely contacts between pairs of residues. In the current study, we combine this kind of inference from coevolutionary information with a coarse‐grained protein force field ordinarily used with single sequence input, the Associative memory, Water mediated, Structure and Energy Model (AWSEM), to achieve improved structure prediction. The resulting Associative memory, Water mediated, Structure and Energy Model with Evolutionary Restraints (AWSEM‐ER) yields a significant improvement in the quality of protein structure prediction over the single sequence prediction from AWSEM when a sufficiently large number of homologous sequences are available. Free energy landscape analysis shows that the addition of the evolutionary term shifts the free energy minimum to more native‐like structures, which explains the improvement in the quality of structures when performing predictions using simulated annealing. Simulations using AWSEM without coevolutionary information have proved useful in elucidating not only protein folding behavior, but also mechanisms of protein function. The success of AWSEM‐ER in de novo structure prediction suggests that the enhanced model opens the door to functional studies of proteins even when no experimentally solved structures are available.

20.

Probabilistic description of protein alignments for sequences and structures

Koike R Kinoshita K Kidera A 《Proteins》2004,56(1):157-166
