Similar articles
Found 20 similar articles (search time: 46 ms)
1.
We describe the application of a method geared toward structural and surface comparison of proteins. The method is based on the Geometric Hashing Paradigm adapted from Computer Vision. It allows for comparison of any two sets of 3-D coordinates, such as protein backbones, protein core or protein surface motifs, and small molecules such as drugs. Here we apply our method to four types of comparisons between pairs of molecules: (1) comparison of the backbones of two protein domains; (2) search for a predefined 3-D Cα motif within the full backbone of a domain; and in particular, (3) comparison of the surfaces of two receptor proteins; and (4) comparison of the surface of a receptor to the surface of a ligand. These aspects complement each other and can contribute toward a better understanding of protein structure and biomolecular recognition. Searches for 3-D surface motifs can be carried out on either receptors or on ligands. The latter may result in the detection of pharmacophoric patterns. If the surfaces of the binding sites of either the receptors or of the ligands are relatively similar, surface superpositioning may aid significantly in the docking problem. Currently, only distance invariants are used in the matching, although additional geometric surface invariants are considered. The speed of our Geometric Hashing algorithm is encouraging, with a typical surface comparison taking only seconds or minutes of CPU time on a SUN 4 SPARC workstation. The direct application of this method to the docking problem is also discussed. We demonstrate the success of this method in its application to two members of the globin family and to two dehydrogenases. © 1993 Wiley-Liss, Inc.
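
The abstract above says that only distance invariants are used in the matching. The following is a deliberately simplified sketch of that idea, not the authors' algorithm: full geometric hashing also uses reference frames, which are omitted here. The function names and the `bin_width` quantization are assumptions made for illustration. Pairwise distances of one point set are stored in a hash table, and the second set votes for every pair that lands in the same distance bin.

```python
import math
from collections import defaultdict
from itertools import combinations


def distance_table(points, bin_width=0.5):
    """Hash table: quantized pairwise distance -> list of index pairs."""
    table = defaultdict(list)
    for i, j in combinations(range(len(points)), 2):
        key = round(math.dist(points[i], points[j]) / bin_width)
        table[key].append((i, j))
    return table


def count_matching_pairs(model, target, bin_width=0.5):
    """Count (model pair, target pair) combinations sharing a distance bin.

    Distances are invariant under rotation and translation, so the two
    sets need not be superposed first. Hypothetical helper for illustration.
    """
    table = distance_table(model, bin_width)
    votes = 0
    for i, j in combinations(range(len(target)), 2):
        key = round(math.dist(target[i], target[j]) / bin_width)
        votes += len(table.get(key, []))
    return votes
```

A high vote count only indicates candidate correspondences; a real comparison would still need to verify them with a superposition.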

2.
This paper reports the isolation of cDNAs encoding the protein backbone of two arabinogalactan-proteins (AGPs), one from pear cell suspension cultures (AGP Pc 2) and the other from suspension cultures of Nicotiana alata (AGP Na 2). The proteins encoded by these cDNAs are quite different from the 'classical' AGP backbones described previously for AGPs isolated from pear suspension cultures and extracts of N. alata styles. The cDNA for AGP Pc 2 encodes a 294 amino acid protein, of which a relatively short stretch (35 amino acids) is Hyp/Pro rich; this stretch is flanked by sequences which are dominated by Asn residues. Asn residues are not a feature of the 'classical' AGP backbones in which Hyp/Pro, Ser, Ala and Thr account for most of the amino acids. The cDNA for AGP Na 2 encodes a 437 amino acid protein, which contains two distinct domains: one rich in Hyp/Pro, Ser, Ala, Thr and the other rich in Asn, Tyr and Ser. The composition and sequence of the Pro-rich domain resemble those of the 'classical' AGP backbone. The Asn-rich domains of the two cDNAs described have no sequence similarity; in both cases they are predicted to be processed to give a mature backbone with a composition similar to that of the 'classical' AGPs. The study shows that different AGPs can differ in the amino acid sequence in the protein backbone, as well as the composition and sequence of the arabinogalactan side-chains. It also shows that differential expression of genes encoding AGP protein backbones, as well as differential glycosylation, can contribute to the tissue specificity of AGPs.

3.
The problem of constructing all-atom model co-ordinates of a protein from an outline of the polypeptide chain is encountered in protein structure determination by crystallography or nuclear magnetic resonance spectroscopy, in model building by homology and in protein design. Here, we present an automatic procedure for generating full protein co-ordinates (backbone and, optionally, side-chains) given the Cα trace and amino acid sequence. To construct backbones, a protein structure database is first scanned for fragments that locally fit the chain trace according to distance criteria. A best path algorithm then sifts through these segments and selects an optimal path with minimal mismatch at fragment joints. In blind tests, using fully known protein structures, backbones (Cα, C, N, O) can be reconstructed with a reliability of 0.4 to 0.6 Å root-mean-square position deviation and not more than 0 to 5% peptide flips. This accuracy is sufficient to identify possible errors in protein co-ordinate sets. To construct full co-ordinates, side-chains are added from a library of frequently occurring rotamers using a simple and fast Monte Carlo procedure with simulated annealing. In tests on X-ray structures determined at better than 2.5 Å resolution, the positions of side-chain atoms in the protein core (less than 20% relative accessibility) have an accuracy of 1.6 Å (r.m.s. deviation) and 70% of χ1 angles are within 30° of the X-ray structure. The computer program MaxSprout is available on request.

4.
Here we perform a systematic exploration of the use of distance constraints derived from small angle X-ray scattering (SAXS) measurements to filter candidate protein structures for the purpose of protein structure prediction. This is an intrinsically more complex task than that of applying distance constraints derived from NMR data where the identity of the pair of amino acid residues subject to a given distance constraint is known. SAXS, on the other hand, yields a histogram of pair distances (pair distribution function), but the identities of the pairs contributing to a given bin of the histogram are not known. Our study is based on an extension of the Levitt-Hinds coarse-grained approach to ab initio protein structure prediction to generate a candidate set of Cα backbones. In spite of the lack of specific residue information inherent in the SAXS data, our study shows that the implementation of a SAXS filter is capable of effectively purifying the set of native structure candidates and thus provides a substantial improvement in the reliability of protein structure prediction. We test the quality of our predicted Cα backbones by doing structural homology searches against the Dali domain library, and find that the results are very encouraging. In spite of the lack of local structural details and limited modeling accuracy at the Cα backbone level, we find that useful information about fold classification can be extracted from this procedure. This approach thus provides a way to use a SAXS-data-based structure prediction algorithm to generate potential structural homologies in cases where lack of sequence homology prevents identification of candidate folds for a given protein. Thus our approach has the potential to help in determination of the biological function of a protein based on structural homology instead of sequence homology.
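
The pair-distance histogram described above can be illustrated with a short sketch. It assumes Cα coordinates given as 3-tuples in Å. The bin width, the number of bins, and the L1 mismatch used to compare a candidate with a reference histogram are all illustrative assumptions, not details taken from the paper.

```python
import math
from itertools import combinations


def pair_distance_histogram(coords, bin_width=1.0, n_bins=50):
    """Normalized histogram of all pairwise distances.

    Pair identities are discarded, mirroring the information content
    of a SAXS pair distribution function.
    """
    counts = [0] * n_bins
    for a, b in combinations(coords, 2):
        k = int(math.dist(a, b) / bin_width)
        if k < n_bins:  # distances beyond the last bin are ignored
            counts[k] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def histogram_mismatch(p, q):
    """L1 distance between two histograms (hypothetical filter score)."""
    return sum(abs(x - y) for x, y in zip(p, q))
```

Under these assumptions, candidate backbones would be ranked by `histogram_mismatch` against the experimental histogram, and those with the largest mismatch discarded.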

5.
We use flexible backbone protein design to explore the sequence and structure neighborhoods of naturally occurring proteins. The method samples sequence and structure space in the vicinity of a known sequence and structure by alternately optimizing the sequence for a fixed protein backbone using rotamer-based sequence search, and optimizing the backbone for a fixed amino acid sequence using atomic-resolution structure prediction. We find that such a flexible backbone design method better recapitulates protein family sequence variation than sequence optimization on fixed backbones or randomly perturbed backbone ensembles for ten diverse protein structures. For the SH3 domain, the backbone structure variation in the family is also better recapitulated than in randomly perturbed backbones. The potential application of this method as a model of protein family evolution is highlighted by a concerted transition to the amino acid sequence in the structural core of one SH3 domain starting from the backbone coordinates of a homologous structure.

6.
The structure of a pi-type Bence-Jones protein variable fragment Au has been determined by molecular replacement methods using the known structure of another Bence-Jones variable fragment Rei (Epp et al., Eur J. Biochem. 45, 513 (1974)). The crystallographic R factor is 0.31 for about 4000 significantly measured reflections between 6.8 and 2.5 Å. The Au protein forms a dimer across a crystallographic twofold axis. The spatial relationship of the two monomers and the conformation of the backbones and of the internal residues are extremely similar to those found in Rei.

7.
8.
Protein Cα coordinates are used to accurately reconstruct complete protein backbones and side-chain directions. This work employs potentials of mean force to align semirigid peptide groups around the axes that connect successive Cα atoms. The algorithm works well for all residue types and secondary structure classes and is stable for imprecise Cα coordinates. Tests on known protein structures show that root mean square errors in predicted main-chain and Cβ coordinates are usually less than 0.3 Å. These results are significantly more accurate than can be obtained from competing approaches, such as modeling of backbone conformations from structurally homologous fragments.

9.
The ab initio folding problem can be divided into two sequential tasks of approximately equal computational complexity: the generation of native-like backbone folds and the positioning of side chains upon these backbones. The prediction of side-chain conformation in this context is challenging, because at best only the near-native global fold of the protein is known. To test the effect of displacements in the protein backbones on side-chain prediction for folds generated ab initio, sets of near-native backbones (≤ 4 Å Cα RMS error) for four small proteins were generated by two methods. The steric environment surrounding each residue was probed by placing the side chains in the native conformation on each of these decoys, followed by torsion-space optimization to remove steric clashes on a rigid backbone. We observe that on average 40% of the χ1 angles were displaced by 40° or more, effectively setting the limits in accuracy for side-chain modeling under these conditions. Three different algorithms were subsequently used for prediction of side-chain conformation. The average prediction accuracy for the three methods was remarkably similar: 49% to 51% of the χ1 angles were predicted correctly overall (33% to 36% of the χ1+2 angles). Interestingly, when the inter-side-chain interactions were disregarded, the mean accuracy increased. A consensus approach is described, in which side-chain conformations are defined based on the most frequently predicted χ angles for a given method upon each set of near-native backbones. We find that consensus modeling, which de facto includes backbone flexibility, improves side-chain prediction: χ1 accuracy improved to 51–54% (36–42% of χ1+2). Implications of a consensus method for ab initio protein structure prediction are discussed. Proteins 33:204–217, 1998. © 1998 Wiley-Liss, Inc.
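
A χ1 accuracy of the kind reported above can be computed as sketched below. The abstract does not state the exact correctness criterion, so the 40° tolerance here is an assumption chosen to mirror the 40° displacement threshold it mentions; the function names are likewise hypothetical. Angles are in degrees, and the difference is wrapped so that -170° and 175° count as 15° apart.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def chi_accuracy(predicted, native, tolerance=40.0):
    """Fraction of predicted angles within `tolerance` degrees of native.

    `tolerance` is an assumed cutoff, not one stated in the abstract.
    """
    hits = sum(
        1 for p, n in zip(predicted, native)
        if angular_difference(p, n) < tolerance
    )
    return hits / len(native)
```

The wrap-around step matters in practice: a naive `abs(a - b)` would score a prediction of -170° against a native 175° as a 345° error.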

10.
Hu X, Kuhlman B. Proteins, 2006, 62(3): 739-748
Loss of side-chain conformational entropy is an important force opposing protein folding and the relative preferences of the amino acids for being buried or solvent exposed may be partially determined by which amino acids lose more side-chain entropy when placed in the core of a protein. To investigate these preferences, we have incorporated explicit modeling of side-chain entropy into the protein design algorithm, RosettaDesign. In the standard version of the program, the energy of a particular sequence for a fixed backbone depends only on the lowest energy side-chain conformations that can be identified for that sequence. In the new model, the free energy of a single amino acid sequence is calculated by evaluating the average energy and entropy of an ensemble of structures generated by Monte Carlo sampling of amino acid side-chain conformations. To evaluate the impact of including explicit side-chain entropy, sequences were designed for 110 native protein backbones with and without the entropy model. In general, the differences between the two sets of sequences are modest, with the largest changes being observed for the longer amino acids: methionine and arginine. Overall, the identity between the designed sequences and the native sequences does not increase with the addition of entropy, unlike what is observed when other key terms are added to the model (hydrogen bonding, Lennard-Jones energies, and solvation energies). These results suggest that side-chain conformational entropy has a relatively small role in determining the preferred amino acid at each residue position in a protein.
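
The relation between average energy, entropy and free energy mentioned above can be made concrete with a small sketch. It is not RosettaDesign's implementation: it assumes a short list of discrete conformations whose energies are known exactly and uses Boltzmann weights in place of Monte Carlo sampling. The function name and the reduced units (energies in the same units as `kT`, entropy in units of k) are assumptions.

```python
import math


def ensemble_free_energy(energies, kT=1.0):
    """Return (F, <E>, S/k) for a discrete Boltzmann ensemble.

    F = <E> - kT * (S/k), with S/k = -sum(p * ln p).
    """
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)
    probs = [w / z for w in weights]
    mean_energy = sum(p * e for p, e in zip(probs, energies))
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return mean_energy - kT * entropy, mean_energy, entropy
```

With a single conformation the entropy term vanishes and the free energy reduces to the lowest energy, which corresponds to the standard scoring mode described in the abstract.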

11.
H Du, R J Simpson, R L Moritz, A E Clarke, A Bacic. The Plant Cell, 1994, 6(11): 1643-1653
Arabinogalactan-proteins (AGPs) from the styles of Nicotiana alata were isolated by ion exchange and gel filtration chromatography. After deglycosylation by anhydrous hydrogen fluoride, the protein backbones were fractionated by reversed-phase HPLC. One of the protein backbones, containing mainly hydroxyproline, alanine, and serine residues (53% of total residues), was digested with proteases, and the peptides were isolated and sequenced. This sequence information allowed the cloning of a 712-bp cDNA, AGPNa1. AGPNa1 encodes a 132-amino acid protein with three domains: an N-terminal secretion signal sequence, which is cleaved from the mature protein; a central sequence, which contains most of the hydroxyproline/proline residues; and a C-terminal hydrophobic region. AGPNa1 is expressed in many tissues of N. alata and related species. The arrangement of domains and amino acid composition of the AGP encoded by AGPNa1 are similar to those of an AGP from pear cell suspension culture filtrate, although the only sequence identity is at the N termini of the mature proteins.

12.
It is observed that during divergent evolution of two proteins with a common phylogenetic origin, the structural similarity of their backbones is often preserved even when the sequence similarity between them decreases to a virtually undetectable level. Here we analyzed whether the conservation of structure during evolution also involves the local atomic structures in the interfaces between secondary structural elements. We used as a study case one protein family, the proteasomal subunits, for which 17 crystal structures are known. These include 14 different subunits of Saccharomyces cerevisiae, 2 subunits of Thermoplasma acidophilum and one subunit of Escherichia coli. The structural core of the 17 proteasomal subunits has 23 secondary structural elements. Any two adjacent secondary structural elements form a molecular interface consisting of two molecular patches. We found 61 interfaces that occurred in all 17 subunits. The 3D shapes of equivalent molecular patches from different proteasomal subunits were compared by superposition. Our results demonstrate that pairs of equivalent molecular patches show an RMSD which is lower than that of randomly chosen patches from unrelated proteins. This is true even when patch comparisons with identical residues were excluded from the analysis. Furthermore, it is known that the sequential dissimilarity is correlated to the RMSD between the backbones of the members of protein families. The question arises whether this is also true for local atomic structures. The results show that the correlation of individual patch RMSD values and local sequence dissimilarities is low and has a wide range from 0 to 0.41. Surprisingly, however, there is a good correlation between the average RMSD of all corresponding patches and the global sequence dissimilarity. This average patch RMSD correlates slightly more strongly than the Cα-trace RMSD with the global sequence dissimilarity.

13.
A de novo redesign of the WW domain (total citations: 7; self-citations: 0; citations by others: 7)
We have used a sequence prediction algorithm and a novel sampling method to design protein sequences for the WW domain, a small beta-sheet motif. The procedure, referred to as SPANS, designs sequences to be compatible with an ensemble of closely related polypeptide backbones, mimicking the inherent flexibility of proteins. Two designed sequences (termed SPANS-WW1 and SPANS-WW2), using only naturally occurring L-amino acids, were selected for study and the corresponding polypeptides were prepared in Escherichia coli. Circular dichroism data suggested that both purified polypeptides adopted secondary structure features related to those of the target without the aid of disulfide bridges or bound cofactors. The structure exhibited by SPANS-WW2 melted cooperatively when the temperature of the solution was raised. Further analysis of this polypeptide by proton nuclear magnetic resonance spectroscopy demonstrated that at 5 °C, it folds into a structure closely resembling a natural WW domain. This achievement constitutes one of a small number of successful de novo protein designs through fully automated computational methods and highlights the feasibility of including backbone flexibility in the design strategy.

14.
Protein structure alignment using a genetic algorithm (total citations: 3; self-citations: 0; citations by others: 3)
Szustakowski JD, Weng Z. Proteins, 2000, 38(4): 428-440
We have developed a novel, fully automatic method for aligning the three-dimensional structures of two proteins. The basic approach is to first align the proteins' secondary structure elements and then extend the alignment to include any equivalent residues found in loops or turns. The initial secondary structure element alignment is determined by a genetic algorithm. After refinement of the secondary structure element alignment, the protein backbones are superposed and a search is performed to identify any additional equivalent residues in a convergent process. Alignments are evaluated using intramolecular distance matrices. Alignments can be performed with or without sequential connectivity constraints. We have applied the method to proteins from several well-studied families: globins, immunoglobulins, serine proteases, dihydrofolate reductases, and DNA methyltransferases. Agreement with manually curated alignments is excellent. A web-based server and additional supporting information are available at http://engpub1.bu.edu/-josephs.
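
One way to evaluate an alignment with intramolecular distance matrices, as the abstract above describes, is sketched here. The scoring function (mean absolute difference between corresponding matrix entries) and all names are assumptions for illustration; the paper's actual scoring scheme is not given in the abstract. An alignment is a list of `(i, j)` pairs matching residue `i` of the first structure to residue `j` of the second.

```python
import math


def distance_matrix(coords):
    """All-against-all distances within a single structure."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def alignment_distance_score(coords_a, coords_b, aligned_pairs):
    """Mean |dA[i1][i2] - dB[j1][j2]| over all pairs of aligned residues.

    Lower is better; 0.0 means the aligned substructures are congruent.
    No superposition is required because only internal distances are used.
    """
    da, db = distance_matrix(coords_a), distance_matrix(coords_b)
    diffs = []
    for x in range(len(aligned_pairs)):
        for y in range(x + 1, len(aligned_pairs)):
            (i1, j1), (i2, j2) = aligned_pairs[x], aligned_pairs[y]
            diffs.append(abs(da[i1][i2] - db[j1][j2]))
    return sum(diffs) / len(diffs) if diffs else 0.0
```

A score of this kind could serve as the fitness that a search procedure, such as a genetic algorithm, tries to minimize over candidate alignments.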

15.
Paul Mach, Patrice Koehl. Proteins, 2013, 81(9): 1556-1570
It is well known that protein fold recognition can be greatly improved if models for the underlying evolutionary history of the folds are taken into account. The improvement, however, exists only if such evolutionary information is available. To circumvent this limitation for protein families that only have a small number of representatives in current sequence databases, we follow an alternate approach in which the benefits of including evolutionary information can be recreated by using sequences generated by computational protein design algorithms. We explore this strategy on a large database of protein templates with 1747 members from different protein families. An automated method is used to design sequences for these templates. We use the backbones from the experimental structures as fixed templates, thread sequences on these backbones using a self‐consistent mean field approach, and score the fitness of the corresponding models using a semi‐empirical physical potential. Sequences designed for one template are translated into a hidden Markov model‐based profile. We describe the implementation of this method, the optimization of its parameters, and its performance. When the native sequences of the protein templates were tested against the library of these profiles, the class, fold, and family memberships of a large majority (>90%) of these sequences were correctly recognized for an E‐value threshold of 1. In contrast, when homologous sequences were tested against the same library, a much smaller fraction (35%) of sequences were recognized; the structural classification of protein families corresponding to these sequences, however, is correctly recognized (with an accuracy of >88%). Proteins 2013; © 2013 Wiley Periodicals, Inc.

16.
Computational protein design relies on several approximations, including the use of fixed backbones and rotamers, to reduce protein design to a computationally tractable problem. However, allowing backbone and off‐rotamer flexibility leads to more accurate designs and greater conformational diversity. Exhaustive sampling of this additional conformational space is challenging, and often impossible. Here, we report a computational method that utilizes a preselected library of native interactions to direct backbone flexibility to accommodate placement of these functional contacts. Using these native interaction modules, termed motifs, improves the likelihood that the interaction can be realized, provided that suitable backbone perturbations can be identified. Furthermore, it allows a directed search of the conformational space, reducing the sampling needed to find low‐energy conformations. We implemented the motif‐based design algorithm in Rosetta, and tested the efficacy of this method by redesigning the substrate specificity of methionine aminopeptidase. In summary, native enzymes have evolved to catalyze a wide range of chemical reactions with extraordinary specificity. Computational enzyme design seeks to generate novel chemical activities by altering the target substrates of these existing enzymes. We have implemented a novel approach to redesign the specificity of an enzyme and demonstrated its effectiveness on a model system.

17.
MOTIVATION: Side-chain positioning is a central component of homology modeling and protein design. In a common formulation of the problem, the backbone is fixed, side-chain conformations come from a rotamer library, and a pairwise energy function is optimized. It is NP-complete to find even a reasonable approximate solution to this problem. We seek to put this hardness result into practical context. RESULTS: We present an integer linear programming (ILP) formulation of side-chain positioning that allows us to tackle large problem sizes. We relax the integrality constraint to give a polynomial-time linear programming (LP) heuristic. We apply LP to position side chains on native and homologous backbones and to choose side chains for protein design. Surprisingly, when positioning side chains on native and homologous backbones, optimal solutions using a simple, biologically relevant energy function can usually be found using LP. On the other hand, the design problem often cannot be solved using LP directly; however, optimal solutions for large instances can still be found using the computationally more expensive ILP procedure. While different energy functions also affect the difficulty of the problem, the LP/ILP approach is able to find optimal solutions. Our analysis is the first large-scale demonstration that LP-based approaches are highly effective in finding optimal (and successive near-optimal) solutions for the side-chain positioning problem.
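
The objective in the formulation above (fixed backbone, one rotamer per position, self plus pairwise energies) can be written down in a few lines. The sketch below solves it by exhaustive enumeration, which stands in for the ILP/LP solvers of the paper and is only feasible for tiny instances. The data layout and function names are assumptions: `self_energy[i][r]` is the energy of rotamer `r` at position `i`, and `pair_energy` maps `(i, r, j, s)` with `i < j` to an interaction energy, defaulting to zero.

```python
from itertools import product


def total_energy(choice, self_energy, pair_energy):
    """Energy of one rotamer assignment (one rotamer index per position)."""
    e = sum(self_energy[i][r] for i, r in enumerate(choice))
    for i in range(len(choice)):
        for j in range(i + 1, len(choice)):
            e += pair_energy.get((i, choice[i], j, choice[j]), 0.0)
    return e


def best_rotamers(self_energy, pair_energy):
    """Exhaustively search all assignments for the minimum-energy one."""
    options = [range(len(rots)) for rots in self_energy]
    return min(
        product(*options),
        key=lambda c: total_energy(c, self_energy, pair_energy),
    )
```

The search space grows as the product of the rotamer counts at each position, which is why the paper turns to ILP and its LP relaxation for realistic problem sizes.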

18.
19.
The single-strand-specific nuclease S1 from Aspergillus oryzae rapidly converts superhelical mitochondrial DNA (African Green Monkey cells, Vero ATCC; CCL 81) into nicked circular DNA. These nicked mitochondrial DNA molecules contain two nicks, one in each strand. The phosphodiester backbones are cleaved during this reaction at or near sites that are alkali-labile. In a second slow reaction the circular mitochondrial DNA is converted into a linear duplex DNA. Permutation tests indicate that this linear DNA represents a nonpermutated collection of DNA molecules. These results suggest that two of the alkali-labile sites in the phosphodiester backbones of the mitochondrial chromosome are closely spaced on opposite strands and at specific positions.

20.
ABSTRACT: BACKGROUND: The most widespread, efficient prokaryotic protein-producing system is one where the T7 phage polymerase recognizes the T7 phage promoter (T7 p/p system). Unfortunately, in this system, target protein expression gradually declines and is often undetectable following 3 to 5 subcultures. Although a number of studies have attempted to stabilize the expression levels of the T7 p/p system, none has resolved the problem adequately, which precludes the use of this system for the production of recombinant proteins on a large scale. RESULTS: We created an expression cassette enabling stable, high-level expression in the T7 p/p system. The cassette was tested with two different vector backbones and two target proteins. In all experiments, the expression system using the new cassette exhibited high and stable protein expression levels when compared to the traditional system. CONCLUSIONS: Herein, we describe a universal expression cassette that enables high-level, stable target protein expression in T7 RNA polymerase-based expression systems. We also present the successful use of this cassette as a novel expression platform and demonstrate its ability to overcome the main deficiency of the T7 p/p system. Thus, we provide a method for using the T7 p/p system on an industrial scale.
