Similar articles
 Found 20 similar articles; search took 15 ms
1.
The ability to predict and characterize distributions of reactivities over families and even superfamilies of proteins opens the door to an array of analyses regarding functional evolution. In this article, insights into functional evolution in the Kazal inhibitor superfamily are gained by analyzing and comparing predicted association free energy distributions against six serine proteinases, over a number of groups of inhibitors: all possible Kazal inhibitors, natural avian ovomucoid first and third domains, and sets of Kazal inhibitors with statistically weighted combinations of residues. The results indicate that, despite the great hypervariability of residues in the 10 proteinase-binding positions, avian ovomucoid third domains evolved to inhibit enzymes similar to the six enzymes selected, whereas the orthologous first domains are not inhibitors of these enzymes on purpose. Hypervariability arises because of similarity in energetic contribution from multiple residue types; conservation is in terms of functionality, with "good" residues, which make positive or less deleterious contributions to the binding, selected more frequently, and yielding overall the same distributional characteristics. Further analysis of the distributions indicates that while nature did optimize inhibitor strength, the objective may not have been the strongest possible inhibitor against one enzyme but rather an inhibitor that is relatively strong against a number of enzymes.

2.
Joseph TT, Osman R. Proteins 2012, 80(5):1283-1298
Silencing in RNAi is strongly affected by guide‐strand/target‐mRNA mismatches. Target nucleation is thought to occur at positions 2–8 of the guide (“seed region”); successful hybridization in this region is the primary determinant of target‐binding affinity and hence target cleavage. To define a molecular basis for the target sequence selectivity in RNAi, we studied all possible distinct single mismatches in seven positions of the seed region, a total of 21 substitutions. We report results from soft‐core thermodynamic integration simulations to determine changes in target binding free energies to Argonaute due to single mismatches in the guide strand, which arise during binding of an imperfectly matched target mRNA. In agreement with experiment, most mismatches impair target binding, consistent with a prominent role for binding affinity changes in RNAi sequence selectivity. Individual Argonaute residues located near the mismatched base pair are found to contribute significantly to binding affinity changes. We also use this methodology to analyze the mismatch‐dependent free energy changes for dissociation of a DNA–RNA hybrid from Argonaute, as a model for the escape of miRNAs from the silencing pathway. Several mismatched sequences of the miRNA have increased affinity to Argonaute, implying that some mismatches may reduce the probability for escape. Furthermore, calculations of base‐substitution‐dependent free energy changes for binding ssDNA reveal mild sequence sensitivity as expected for guide strand binding to Argonaute. Our findings give a thermodynamic basis for RNAi target sequence selectivity and suggest that miRNA mismatches may increase silencing effectiveness and thus could be evolutionarily advantageous. Proteins 2012; © 2011 Wiley Periodicals, Inc.

3.
For many protein families, such as serine proteinases or serine proteinase inhibitors, the family assignment predicts reactivity only in general terms. Both detailed specificity and quantitative reactivity are lacking. We believe that, for many such protein families, algorithms can be devised by defining the subset of n functionally important sequence positions, making the 19n possible single mutants and measuring their reactivity. Given the assumption that the contributions of the n positions are additive, the reactivities of the 20^n variants can be predicted. This is illustrated by an almost complete algorithm for the Kazal family of protein inhibitors of serine proteinases.
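The additivity assumption in this abstract can be illustrated with a minimal sketch. Everything here is hypothetical: the two-position wild type, the energy values, and the function name `predict_free_energy` are made up for illustration and are not taken from the article. The point is only that 19n measured single-mutant effects suffice to predict all 20^n variants when contributions add.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def predict_free_energy(variant, wild_type, wt_energy, single_mutant_delta):
    """Additive prediction: wild-type energy plus one delta per mutated position."""
    total = wt_energy
    for pos, (wt_res, res) in enumerate(zip(wild_type, variant)):
        if res != wt_res:
            total += single_mutant_delta[(pos, res)]
    return total


# Toy data (assumed values, not measurements): n = 2 variable positions.
wild_type = "LY"
wt_energy = -10.0
single_mutant_delta = {
    (pos, res): 0.5 * (pos + 1)  # placeholder effect of each single mutant
    for pos in range(len(wild_type))
    for res in AMINO_ACIDS
    if res != wild_type[pos]
}

n = len(wild_type)
n_measurements = len(single_mutant_delta)  # 19 * n single mutants
all_variants = ["".join(v) for v in product(AMINO_ACIDS, repeat=n)]  # 20 ** n
predictions = {
    v: predict_free_energy(v, wild_type, wt_energy, single_mutant_delta)
    for v in all_variants
}
```

With n = 10 positions, as for the Kazal binding positions mentioned elsewhere in this listing, the same idea would need 190 measurements, while enumerating all 20^10 variants explicitly would no longer be practical.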

4.
Despite significant successes in structure‐based computational protein design in recent years, protein design algorithms must be improved to increase the biological accuracy of new designs. Protein design algorithms search through an exponential number of protein conformations, protein ensembles, and amino acid sequences in an attempt to find globally optimal structures with a desired biological function. To improve the biological accuracy of protein designs, it is necessary to increase both the amount of protein flexibility allowed during the search and the overall size of the design, while guaranteeing that the lowest‐energy structures and sequences are found. DEE/A*‐based algorithms are the most prevalent provable algorithms in the field of protein design and can provably enumerate a gap‐free list of low‐energy protein conformations, which is necessary for ensemble‐based algorithms that predict protein binding. We present two classes of algorithmic improvements to the A* algorithm that greatly increase the efficiency of A*. First, we analyze the effect of ordering the expansion of mutable residue positions within the A* tree and present a dynamic residue ordering that reduces the number of A* nodes that must be visited during the search. Second, we propose new methods to improve the conformational bounds used to estimate the energies of partial conformations during the A* search. The residue ordering techniques and improved bounds can be combined for additional increases in A* efficiency. Our enhancements enable all A*‐based methods to more fully search protein conformation space, which will ultimately improve the accuracy of complex biomedically relevant designs. Proteins 2015; 83:1859–1877. © 2015 Wiley Periodicals, Inc.

5.
Zhang B, Wustman BA, Morse D, Evans JS. Biopolymers 2002, 63(6):358-369
The lustrin superfamily represents a unique group of biomineralization proteins localized between layered aragonite mineral plates (i.e., nacre layer) in mollusk shell. Recent atomic force microscopy (AFM) pulling studies have demonstrated that the lustrin‐containing organic nacre layer in the abalone, Haliotis rufescens, exhibits a typical sawtooth force‐extension curve with hysteretic recovery. This force extension behavior is reminiscent of reversible unfolding and refolding in elastomeric proteins such as titin and tenascin. Since secondary structure plays an important role in force‐induced protein unfolding and refolding, the question is, What secondary structure(s) exist within the major domains of Lustrin A? Using a model peptide (FPGKNVNCTSGE) representing the 12‐residue consensus sequence found near the N‐termini of the first eight cysteine‐rich domains (C‐domains) within the Lustrin A protein, we employed CD, NMR spectroscopy, and simulated annealing/minimization to determine the secondary structure preferences for this sequence. At pH 7.4, we find that the 12‐mer sequence adopts a loop conformation, consisting of a “bend” or “turn” involving residues G3–K4 and N7–C8–T9, with extended conformations arising at F1–G3; K4–V6; T9–S10–G11 in the sequence. Minor pH‐dependent conformational effects were noted for this peptide; however, there is no evidence for a salt‐bridge interaction between the K4 and E12 side chains. The presence of a loop conformation within the highly conserved -PG-, -NVNCT- sequence of C1–C8 domains may have important structural and mechanistic implications for the Lustrin A protein with regard to elastic behavior. © 2002 Wiley Periodicals, Inc. Biopolymers 63: 358–369, 2002

6.
A novel conotoxin named lt6c, an O‐superfamily conotoxin, was identified from the cDNA library of venom duct of Conus litteratus. The full‐length cDNA contains an open reading frame encoding a predicted 22‐residue signal peptide, a 22‐residue proregion and a mature peptide of 28 amino acids. The signal peptide sequence of lt6c is highly conserved in O‐superfamily conotoxins and the mature peptide consists of six cysteines arranged in the pattern C-C-CC-C-C that defines the O‐superfamily of conotoxins. The mature peptide fused with thioredoxin, 6‐His tag, and a Factor Xa cleavage site was successfully expressed in Escherichia coli. About 12 mg lt6c was purified from 1 L of culture. Under whole‐cell patch‐clamp mode, lt6c inhibited sodium currents on adult rat dorsal root ganglion neurons. Therefore, lt6c is a novel O‐superfamily conotoxin that is able to block sodium channels. Copyright © 2008 European Peptide Society and John Wiley & Sons, Ltd.

7.
Natural biodiversity undoubtedly inspires biocatalysis research and innovation. Biotransformations of interest also inspire the search for appropriate biocatalysts in nature. Indeed, natural genetic resources have been found to support the hydrolysis and synthesis of not only common but also unusual synthetic scaffolds. The emerging tool of metagenomics has the advantage of allowing straightforward identification of activity directly applicable as biocatalysis. However, new enzymes must not only have outstanding properties in terms of performance but also other properties superior to those of well-established commercial preparations in order to successfully replace the latter. Esterases (EST) and lipases (LIP) from the α/β-hydrolase fold superfamily are among the enzymes primarily used in biocatalysis. Accordingly, they have been extensively examined with metagenomics. Here we provide an updated (October 2015) overview of sequence and functional data sets of 288 EST–LIP enzymes with validated functions that have been isolated from metagenomes and (mostly partially) characterized. Through sequence, biochemical, and reactivity analyses, we attempted to understand the phenomenon of variability and versatility within this group of enzymes and to implement this knowledge to identify sequences encoding EST–LIP which may be useful for biocatalysis. We found that the diversity of described EST–LIP polypeptides was not dominated by a particular type of protein or highly similar clusters of proteins but rather by diverse nonredundant sequences. Purified EST–LIP exhibited a wide temperature activity range of 10–85 °C, although a bias toward a mesophilic temperature range (35–40 °C) was observed. At least 60% of the total characterized metagenomics-derived EST–LIP showed outstanding properties in terms of stability (solvent tolerance) and reactivity (selectivity and substrate profile), which are the features of interest in biocatalysis. We hope that, in the future, the search for and utilization of sequences similar to those encoding already characterized EST–LIP enzymes from metagenomes may be of interest for promoting unresolved biotransformations in the chemical industry. Some examples are discussed in this review.

8.
Patrick Slama. Proteins 2018, 86(1):3-12
Residues at different positions of a multiple sequence alignment sometimes evolve together, due to a correlated structural or functional stress at these positions. Co‐evolution has thus been evidenced computationally in multiple proteins or protein domains. Here, we wish to study whether an evolutionary stress is exerted on a sequence alignment across protein domains, i.e., on longer sequence separations than within a single protein domain. JmjC‐containing lysine demethylases were chosen for analysis, as a follow‐up to previous studies; these proteins are important multidomain epigenetic regulators. In these proteins, the JmjC domain is responsible for the demethylase activity, and surrounding domains interact with histones, DNA or partner proteins. This family of enzymes was analyzed at the sequence level, in order to determine whether the sequence of JmjC‐domains was affected by the presence of a neighboring JmjN domain or PHD finger in the protein. Multiple positions within JmjC sequences were shown to have their residue distributions significantly altered by the presence of the second domain. Structural considerations confirmed the relevance of the analysis for JmjN‐JmjC proteins, while among PHD‐JmjC proteins, the length of the linker region could be correlated to the residues observed at the most affected positions. The correlation of domain architecture with residue types at certain positions, as well as that of overall architecture with protein function, is discussed. The present results thus evidence the existence of an across‐domain evolutionary stress in JmjC‐containing demethylases, and provide further insights into the overall domain architecture of JmjC domain‐containing proteins.

9.
Ovomucoids consist of a single polypeptide chain which is composed of three tandem Kazal domains. Each Kazal domain is an actual or putative protein inhibitor of serine proteinases. Ovomucoid third domains were already isolated and sequenced from 126 species of birds (Laskowski et al., 1987, 1990). This paper adds 27 new species. A number of generalizations are made on the basis of sequences from 153 species. The residues that are in contact with the enzyme in enzyme-inhibitor complexes are strikingly hypervariable. While the primary specificity residue, P1, is the most variable, substitutions occur predominantly among aliphatic, hydrophobic residues. Consensus sequences for an avian ovomucoid third domain, for a b-type Kazal domain (i.e., a COOH terminal domain of multidomain inhibitors) and for a general Kazal domain are given. Finally, the individual new sequences are briefly discussed.

10.
Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that the sequence conservation is closely correlated with the weighted contact number (WCN), a measure of packing density for a residue's structural environment, calculated only based on the Cα positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment‐related structural properties calculated based on different protein substructures, such as a protein's all atoms, backbone atoms, side‐chain atoms, or side‐chain centroid. To know whether the Cα atomic positions are adequate to show the relationship between residue environment and sequence conservation or not, here we compared Cα atoms with other substructures in their contributions to the sequence conservation. Our results show that Cα positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between Cα atoms and the other substructures are high, yielding similar structure–conservation relationship. Taking the WCN as an example, the average overlapping contribution to sequence conservation is 87% between Cα and all‐atom substructures. These results indicate that the Cα atoms of a protein structure alone can reflect sequence conservation at the residue level.
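As a rough illustration of the weighted contact number mentioned in this abstract, the sketch below assumes the commonly used form in which residue i sums the inverse squared Cα–Cα distance to every other residue. The abstract itself does not state the formula, so the weighting, the function name `weighted_contact_number`, and the toy coordinates are all assumptions.

```python
def weighted_contact_number(ca_coords):
    """Per-residue WCN, assuming WCN_i = sum over j != i of 1 / r_ij**2.

    ca_coords is a list of (x, y, z) C-alpha coordinates, one per residue.
    Coordinates must be distinct, otherwise a zero distance is divided by.
    """
    wcn = []
    for i, (xi, yi, zi) in enumerate(ca_coords):
        total = 0.0
        for j, (xj, yj, zj) in enumerate(ca_coords):
            if i == j:
                continue  # a residue does not count as its own contact
            sq_dist = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            total += 1.0 / sq_dist
        wcn.append(total)
    return wcn


# Toy input (not a real structure): three residues on a line, 1 unit apart.
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_wcn = weighted_contact_number(toy_coords)
```

In this toy case the middle residue has the highest value because it is closest to both neighbors, which matches the reading of WCN as a packing-density measure.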

11.
In this paper, we present numerical evidence that supports the notion of minimization in the sequence space of proteins for a target conformation. We use the conformations of the real proteins in the Protein Data Bank (PDB) and present computationally efficient methods to identify the sequences with minimum energy. We use an edge-weighted connectivity graph for ranking the residue sites with a reduced amino acid alphabet and then use continuous optimization to obtain the energy-minimizing sequences. Our methods enable the computation of a lower bound as well as a tight upper bound for the energy of a given conformation. We validate our results by using three different inter-residue energy matrices for five proteins from the PDB, and by comparing our energy-minimizing sequences with 80 million diverse sequences that are generated based on different considerations in each case. When we submitted some of our chosen energy-minimizing sequences to Basic Local Alignment Search Tool (BLAST), we obtained some sequences from the non-redundant protein sequence database that are similar to ours with an E-value of the order of 10^-7. In summary, we conclude that proteins show a trend towards minimizing energy in the sequence space but do not seem to adopt the global energy-minimizing sequence. The reason for this could be either that the existing energy matrices are not able to accurately represent the inter-residue interactions in the context of the protein environment or that Nature does not push the optimization in the sequence space, once it is able to perform the function.

12.
Ozer N, Haliloglu T, Schiffer CA. Proteins 2006, 64(2):444-456
Drug resistance in HIV-1 protease can also occasionally confer a change in the substrate specificity. Through the use of computational techniques, a relationship can be determined between the substrate sequence and three-dimensional structure of HIV-1 protease, and be utilized to predict substrate specificity. In this study, we introduce a biased sequence search threading (BSST) methodology to analyze the preferences of substrate positions and correlations between them that might also identify which positions within known substrates can likely tolerate sequence variability and which cannot. The potential sequence space was efficiently explored using a low-resolution knowledge-based scoring function. The low-energy substrate sequences generated by the biased search are correlated with the natural substrates. Octameric sequences were predicted using the probabilities of residue positions in the sequences generated by BSST in three ways: considering each position in the substrate independently, considering pairwise interdependency, and considering triple-wise interdependency. The prediction of octameric sequences using the triple-wise conditional probabilities produces the most accurate results, reproducing most of the sequences for five of the nine natural substrates and implying that there is a complex interdependence between the different substrate residue positions. This likely reflects that HIV-1 protease recognizes the overall shape of the substrate more than its specific sequence.

13.
The “extended” type of short chain dehydrogenases/reductases (SDR) share a remarkable similarity in their tertiary structures in spite of being highly divergent in their functions and sequences. We have carried out principal component analysis (PCA) on structurally equivalent residue positions of 10 SDR families using information theoretic measures like Jensen–Shannon divergence and average Shannon entropy as variables. The results classify residue positions in the SDR fold into six groups, one of which is characterized by low Shannon entropies but high Jensen–Shannon divergence against the reference family SDR1E, suggesting that these positions are responsible for the specific functional identities of individual SDR families, distinguishing them from the reference family SDR1E. Site directed mutagenesis of three residues from this group in the enzyme UDP‐Galactose 4‐epimerase belonging to SDR1E shows that the mutants promote the formation of NADH‐containing abortive complexes. Finally, molecular dynamics simulations have been used to suggest a mechanism by which the mutants interfere with the re‐oxidation of NADH leading to the formation of abortive complexes. Proteins 2014; 82:2842–2856. © 2014 Wiley Periodicals, Inc.
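The two information-theoretic measures named in this abstract can be sketched for a single alignment column as follows. This is a minimal illustration under stated assumptions: base-2 logarithms, no pseudocounts or sequence weighting, gap characters ignored, and hypothetical function names and toy columns. The article's actual pipeline is not described in the abstract.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def column_distribution(column):
    """Residue frequencies for one alignment column (non-residue symbols ignored).

    The column must contain at least one standard residue.
    """
    counts = Counter(column)
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jensen_shannon_divergence(p, q):
    """JSD(p, q) = H((p + q) / 2) - (H(p) + H(q)) / 2, between 0 and 1 in bits."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(m) - (shannon_entropy(p) + shannon_entropy(q)) / 2


# Toy columns (made up): a family fixed on G versus a reference fixed on A.
family_column = column_distribution("GGGGGGGG")
reference_column = column_distribution("AAAAAAAA")
divergence = jensen_shannon_divergence(family_column, reference_column)
```

The toy case mirrors the group singled out in the abstract: each column has low entropy (fully conserved within its family) while the divergence between the two families is at its maximum.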

14.
T Suzuki, A Varshavsky. The EMBO Journal 1999, 18(21):6017-6026
The N-degrons, a set of degradation signals recognized by the N-end rule pathway, comprise a protein's destabilizing N-terminal residue and an internal lysine residue. We show that the strength of an N-degron can be markedly increased, without loss of specificity, through the addition of lysine residues. A nearly exhaustive screen was carried out for N-degrons in the lysine (K)-asparagine (N) sequence space of the 14-residue peptides containing either K or N (16 384 different sequences). Of these sequences, 68 were found to function as N-degrons, and three of them were at least as active and specific as any of the previously known N-degrons. All 68 K/N-based N-degrons lacked the lysine at position 2, and all three of the strongest N-degrons contained lysines at positions 3 and 15. The results support a model of the targeting mechanism in which the binding of the E3-E2 complex to the substrate's destabilizing N-terminal residue is followed by a stochastic search for a sterically suitable lysine residue. Our strategy of screening a small library that encompasses the entire sequence space of two amino acids should be of use in many settings, including studies of protein targeting and folding.
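The size of the K/N sequence space quoted in this abstract (16 384 = 2^14) can be checked by enumerating it. The sketch below also filters by the positional pattern the abstract reports. The numbering is an assumption on my part: since the abstract refers to position 15 of a 14-residue stretch, the sketch assumes the stretch occupies positions 2 to 15, with the N-terminal residue at position 1. The helper names are hypothetical.

```python
from itertools import product


def kn_library(length=14):
    """All sequences of the given length built only from K (Lys) and N (Asn)."""
    return ["".join(residues) for residues in product("KN", repeat=length)]


def residue_at(sequence, position, first_position=2):
    """Residue at a protein position, assuming the stretch starts at position 2."""
    return sequence[position - first_position]


library = kn_library()

# Sequences consistent with the reported pattern: no lysine at position 2,
# lysines at positions 3 and 15 (numbering assumed as described above).
no_lys_at_2 = [s for s in library if residue_at(s, 2) != "K"]
strong_pattern = [
    s for s in no_lys_at_2
    if residue_at(s, 3) == "K" and residue_at(s, 15) == "K"
]
```

The filter only counts sequences compatible with the stated positions; it says nothing about which of them are active N-degrons, which the abstract determines experimentally.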

15.
In multi‐domain proteins, the domains typically run end‐to‐end, that is, one domain follows the C‐terminus of another domain. However, approximately 10% of multi‐domain proteins are formed by insertion of one domain sequence into that of another domain. Detecting such insertions within protein sequences is a fundamental challenge in structural biology. The haloacid dehalogenase superfamily (HADSF) serves as a challenging model system wherein a variable cap domain (~5–200 residues in length) accessorizes the ubiquitous Rossmann‐fold core domain, with variations in insertion site and topology corresponding to different classes of cap types. Herein, we describe a comprehensive computational strategy, CapPredictor, for determining large, variable domain insertions in protein sequences. Using a novel sequence‐alignment algorithm in conjunction with a structure‐guided sequence profile from 154 core‐domain‐only structures, more than 40,000 HADSF member sequences were assigned cap types. The resulting data set afforded insight into HADSF evolution. Notably, a similar distribution of cap‐type classes across different phyla was observed, indicating that all cap types existed in the last universal common ancestor. In addition, comparative analyses of the predicted cap‐type and functional assignments showed that different cap types carry out similar chemistries. Thus, while cap domains play a role in substrate recognition and chemical reactivity, cap‐type does not strictly define functional class. Through this example, we have shown that CapPredictor is an effective new tool for the study of form and function in protein families where domain insertion occurs. Proteins 2014; 82:1896–1906. © 2014 Wiley Periodicals, Inc.

16.
Michael Laskowski Jr. (1930-2004) was a pioneer in the field of Standard mechanism serine proteinase inhibitors. He made numerous important contributions in the field. This article highlights some of his most important contributions such as the discovery of the reactive site in serine proteinase inhibitors, the proposal of the Standard mechanism of inhibition, and the sequence to reactivity algorithm for the Kazal family of inhibitors.

17.
Human plasma kallikrein (huPK) is a proteinase that participates in several biological processes. Although various inhibitors control its activity, members of the Kazal family have not been identified as huPK inhibitors. In order to map the enzyme active site, we synthesized peptides based on the reactive site (PRILSPV) of a natural Kazal-type inhibitor found in Cayman plasma, which is not an huPK inhibitor. As expected, the leader peptide (Abz-SAPRILSPVQ-EDDnp) was not cleaved by huPK. Modifications to the leader peptide at P'1, P'3 and P'4 positions were made according to the sequence of a phage display-generated recombinant Kazal inhibitor (PYTLKWV) that presented huPK-binding ability. Novel peptides were identified as substrates for huPK and related enzymes. Both porcine pancreatic and human plasma kallikreins cleaved peptides at Arg or Lys bonds, whereas human pancreatic kallikrein cleaved bonds involving Arg or a pair of hydrophobic amino acid residues. Peptide hydrolysis by pancreatic kallikrein was not significantly altered by amino acid replacements. The peptide Abz-SAPRILSWVQ-EDDnp was the best substrate and a competitive inhibitor for huPK, indicating that Trp residue at the P'4 position is important for enzyme action.

18.
Peter Palese. Cell 1977, 10(1):1-10
The 5′ terminal sequences of several adenovirus 2 (Ad2) mRNAs, isolated late in infection, are complementary to sequences within the Ad2 genome which are remote from the DNA from which the main coding sequence of each mRNA is transcribed. This has been observed by forming RNA displacement loops (R loops) between Ad2 DNA and unfractionated polysomal RNA from infected cells. The 5′ terminal sequences of mRNAs in R loops, variously located between positions 36 and 92, form complex secondary hybrids with single-stranded DNA from restriction endonuclease fragments containing sequences to the left of position 36 on the Ad2 genome. The structures visualized in the electron microscope show that short sequences coded at map positions 16.6, 19.6 and 26.6 on the R strand are joined to form a leader sequence of 150–200 nucleotides at the 5′ end of many late mRNAs. A late mRNA which maps to the left of position 16.6 shows a different pattern of second site hybridization. It contains sequences from 4.9–6.0 linked directly to those from 9.6–10.9. These findings imply a new mechanism for the biosynthesis of Ad2 mRNA in mammalian cells.

19.
A structural database of 11 families of chains differing by a single amino acid substitution has been built. Another structural dataset of 5 families with identical sequences has been used for comparison. The RMSD computed after a global superimposition of the mutated protein on each native one is smaller than the RMSD calculated among proteins of identical sequences. The effect of the perturbation is very local, and not necessarily the highest at the position of the mutation. An RMSD between mutated and native proteins is computed over a 3‐residue or a 7‐residue window at each position. To separate the effects of structural fluctuations due to point mutations from other sources, pair RMSD have been translated into P values which themselves are included in a score called P‐RANK. This score allows highlighting small backbone distortions by comparing these RMSD between mutated and native positions to the RMSD at the same positions in the absence of a mutation. The P‐RANK score shows that 38% of all mutations produce a significant effect on the displacement. When compared with a random distribution of RMSD at un‐mutated positions, we show that, even if the RMSD is greater when the mutation is in loops than in regular secondary structure, the relative effect is more important for regular secondary structures and for buried positions. We confirm the absence of correlation between RMSD and the predicted variation of free energy of folding but we found a small correlation between high RMSD and the error in the prediction of ΔΔG.
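The per-position windowed RMSD described in this abstract can be sketched as below. Several details are assumptions rather than the article's method: the two coordinate sets are taken to be already superimposed, one atom (for example Cα) represents each residue, the window is centered on the position, and windows are truncated at chain ends. The function name and toy coordinates are hypothetical, and the P-value and P-RANK steps are not reproduced.

```python
import math


def windowed_rmsd(native, mutant, window=3):
    """RMSD profile over a centered window at each position.

    native and mutant are equal-length lists of (x, y, z) coordinates that
    are assumed to be superimposed already. Windows shrink at the chain ends.
    """
    if len(native) != len(mutant):
        raise ValueError("coordinate lists must have the same length")
    half = window // 2
    # Squared displacement of each residue between the two structures.
    sq_dev = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(native, mutant)
    ]
    profile = []
    for i in range(len(sq_dev)):
        lo = max(0, i - half)
        hi = min(len(sq_dev), i + half + 1)
        profile.append(math.sqrt(sum(sq_dev[lo:hi]) / (hi - lo)))
    return profile


# Toy chains (made up): five residues on a line, one displaced by 3 units.
native_coords = [(float(i), 0.0, 0.0) for i in range(5)]
mutant_coords = list(native_coords)
mutant_coords[2] = (2.0, 3.0, 0.0)
profile = windowed_rmsd(native_coords, mutant_coords, window=3)
```

Passing `window=7` gives the wider of the two window sizes mentioned in the abstract; a single displaced residue then raises the profile over a broader stretch of positions, at a lower height.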

20.
A fully automatic procedure for predicting the amino acid sequences compatible with a given target structure is described. It is based on the CHARMM package, and uses an all atom force-field and rotamer libraries to describe and evaluate side-chain types and conformations. Sequences are ranked by a quantity akin to the free energy of folding, which incorporates hydration effects. Exact (Branch and Bound) and heuristic optimisation procedures are used to identify highly scoring sequences from an astronomical number of possibilities. These sequences include the minimum free energy sequence, as well as all amino acid sequences whose free energy lies within a specified window from the minimum. Several applications of our procedure are illustrated. Prediction of side-chain conformations for a set of ten proteins yields results comparable to those of established side-chain placement programs. Applications to sequence optimisation comprise the re-design of the protein cores of c-Crk SH3 domain, the B1 domain of protein G and Ubiquitin, and of surface residues of the SH3 domain. In all calculations, no restrictions are imposed on the amino acid composition and identical parameter settings are used for core and surface residues. The best scoring sequences for the protein cores are virtually identical to wild-type. They feature no more than one to three mutations in a total of 11-16 variable positions. Tests suggest that this is due to the balance between various contributions in the force-field rather than to overwhelming influence from packing constraints. The effectiveness of our force-field is further supported by the sequence predictions for surface residues of the SH3 domain. More mutations are predicted than in the core, seemingly in order to optimise the network of complementary interactions between polar and charged groups. This appears to be an important energetic requirement in the absence of the partner molecules with which the SH3 domain interacts, which were not included in the calculations. Finally, a detailed comparison between the sequences generated by the heuristic and exact optimisation algorithms commends a note of caution concerning the efficiency of heuristic procedures in exploring sequence space.
