Similar articles
1.
Recently we developed methods for the construction of knowledge-based mean fields from a data base of known protein structures. As shown previously, this approach can be used to calculate ensembles of probable conformations for short fragments of polypeptide chains. Here we develop procedures for the assembly of short fragments into complete three-dimensional models of polypeptide chains. The amino acid sequence of a given protein is decomposed into all possible overlapping fragments of a given length, and an ensemble of probable conformations is calculated for each fragment. The fragments are assembled into a complete model by choosing appropriate conformations from the individual ensembles and by averaging over equivalent angles. Finally, a consistent model is obtained by rebuilding the conformation from the average angles. From the average angles the local variability of the structure can be calculated, which is a useful criterion for the reliability of the model. The procedure is applied to the calculation of the local backbone conformations of myoglobin and lysozyme, whose structures have been solved by X-ray analysis, and thymosin beta 4, a polypeptide of 43 amino acid residues whose structure was recently investigated by NMR spectroscopy. We demonstrate that substantial fractions of the calculated local backbone conformations are similar to the experimentally determined structures.
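
The averaging step in the abstract above can be illustrated with a small sketch. The abstract does not say how equivalent angles are averaged or how variability is measured; the circular mean and the spread value (one minus the mean resultant length) used here are assumptions chosen because dihedral angles are periodic, and the function name is hypothetical.

```python
import math

def circular_mean_and_spread(angles_deg):
    """Return (mean angle in degrees, spread in [0, 1]) for periodic angles.

    Assumption: a circular mean stands in for the abstract's "averaging over
    equivalent angles"; spread = 1 - mean resultant length (0 = all identical).
    """
    if not angles_deg:
        raise ValueError("need at least one angle")
    sin_sum = sum(math.sin(math.radians(a)) for a in angles_deg)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles_deg)
    mean = math.degrees(math.atan2(sin_sum, cos_sum))
    resultant = math.hypot(sin_sum, cos_sum) / len(angles_deg)
    return mean, 1.0 - resultant

# Two overlapping fragments propose 170 and -170 degrees for the same angle.
# A plain arithmetic mean would give 0; the circular mean lies at +/-180.
mean_phi, spread_phi = circular_mean_and_spread([170.0, -170.0])
```

A small spread indicates that the overlapping fragments agree on the angle, which matches the abstract's use of local variability as a reliability criterion.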

2.
Kannan S, Zacharias M. Proteins, 2007, 66(3): 697-706
During replica exchange molecular dynamics (RexMD) simulations, several replicas of a system are simulated at different temperatures in parallel, allowing for exchange between replicas at frequent intervals. This technique allows significantly improved sampling of conformational space and is increasingly being used for structure prediction of peptides and proteins. A drawback of the standard temperature RexMD is the rapid increase in the number of replicas needed to cover a desired temperature range as system size increases. In an effort to limit the number of replicas, a new Hamiltonian-RexMD method has been developed that is specifically designed to enhance the sampling of peptide and protein conformations by applying various levels of a backbone biasing potential for each replica run. The biasing potential lowers the barrier for backbone dihedral transitions and promotes enhanced peptide backbone transitions along the replica coordinate. Application to several peptide systems, all simulated in explicit solvent, indicates significantly improved conformational sampling when compared with standard MD simulations. This was achieved with a very modest number of 5-7 replicas for each simulation system, making the method ideally suited for peptide and protein folding simulations as well as refinement of protein model structures in the presence of explicit solvent.
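
For orientation, the exchange step in a Hamiltonian replica exchange scheme can be sketched with the standard Metropolis criterion for two replicas at the same temperature. This is a generic textbook form, not the authors' implementation; the function name, argument names, and the kcal/mol unit choice are assumptions.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption

def hamiltonian_exchange_probability(e_ii, e_jj, e_ij, e_ji, temperature=300.0):
    """Metropolis acceptance probability for swapping two replicas.

    e_ij is the energy of replica j's configuration evaluated with replica i's
    (biased) Hamiltonian; e_ii and e_jj are the energies before the swap.
    """
    beta = 1.0 / (K_B * temperature)
    delta = beta * ((e_ij + e_ji) - (e_ii + e_jj))
    return 1.0 if delta <= 0.0 else math.exp(-delta)

# A swap that raises the combined energy by 1 kcal/mol at 300 K.
p_swap = hamiltonian_exchange_probability(e_ii=-10.0, e_jj=-12.0, e_ij=-10.5, e_ji=-10.5)
```

Because only the biasing potential differs between replicas, the energy differences entering the criterion stay small, which is consistent with the abstract's point that few replicas are needed.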

3.
PKB: a program system and data base for analysis of protein structure (cited by: 2; self-citations: 0; citations by others: 2)
S H Bryant. Proteins, 1989, 5(3): 233-247
PKB is a computer program system that combines a data base of three-dimensional protein structures with a series of algorithms for pattern recognition, data analysis, and graphics. By typing relatively simple commands the user may search the data base for instances of a structural motif and analyze in detail the set of individual structures that are found. The application of PKB to the study of protein folding is illustrated in three examples. The first analysis compares the conformations observed for a short sequential motif, sequences similar to the cell-attachment signal Arg-Gly-Asp. The second compares sequences observed for a conformational motif, a 16-residue beta alpha beta unit. The third analysis considers a population of substructures containing ion-pair interactions, examining the relationship of frequency of occurrence to calculated electrostatic energy.

4.
Homology modeling methods have been used to construct models of two proteins: the histidine-containing phosphocarrier protein (HPr) from Mycoplasma capricolum and human eosinophil-derived neurotoxin (EDN). Comparison of the models with the subsequently determined X-ray crystal structures indicates that the core regions of both proteins are reasonably well reproduced, although the template structures are closer to the X-ray structures in these regions; possible enhancements are discussed. The conformations of most of the side chains in the core of HPr are well reproduced in the modeled structure. As expected, the conformations of surface side chains in this protein differ significantly from the X-ray structure. The loop regions of EDN were incorrectly modeled; reasons for this and possible enhancements are discussed. © 1995 Wiley-Liss, Inc.

5.
6.
We present a new method, secondary structure prediction by deviation parameter (SSPDP), for predicting the secondary structure of proteins from amino acid sequence. Deviation parameters (DP) for amino acid singlets, doublets and triplets were computed with respect to secondary structural elements of proteins, based on the Dictionary of Secondary Structure of Proteins (DSSP)-generated secondary structure for 408 selected nonhomologous proteins. Amino acid triplets that are not found in the selected dataset are assigned a DP value of zero with respect to the secondary structural elements of proteins. The total number of parameters generated is 15,432, out of 25,260 possible parameters. The deviation parameters are complete with respect to amino acid singlets and doublets, and partially complete with respect to amino acid triplets. These generated parameters were used to predict secondary structural elements from amino acid sequence. The secondary structure predicted by our method (SSPDP) was compared with that of single sequence (NNPREDICT) and multiple sequence (PHD) methods. The average prediction accuracy for α-helix by the SSPDP, NNPREDICT and PHD methods was found to be 57%, 44% and 69%, respectively, for the proteins in the selected dataset. For β-strand the prediction accuracy was found to be 69%, 21% and 53%, respectively, by the SSPDP, NNPREDICT and PHD methods. This indicates that secondary structure prediction by our method is comparable to the PHD method and much better than the NNPREDICT method.

7.
Langevin dynamics is used with our physics-based united-residue (UNRES) force field to study the folding pathways of the B-domain of staphylococcal protein A (1BDD; alpha; 46 residues). With 400 trajectories of protein A started from the extended state (to gather meaningful statistics), and simulated for more than 35 ns each, 380 of them folded to the native structure. The simulations were carried out at the optimal folding temperature of protein A with this force field. To the best of our knowledge, this is the first simulation study of protein-folding kinetics with a physics-based force field in which reliable statistics can be gathered. In all the simulations, the C-terminal alpha-helix forms first. The ensemble of the native basin has an average RMSD value of 4 Å from the native structure. There is a stable intermediate along the folding pathway, in which the N-terminal alpha-helix is unfolded; this intermediate appears on the way to the native structure in less than one-fourth of the folding pathways, while the remaining ones proceed directly to the native state. Non-native structures persist until the end of the simulations, but the native-like structures dominate. To express the kinetics of protein A folding quantitatively, two observables were used: (i) the average alpha-helix content (averaged over all trajectories within a given time window); and (ii) the fraction of conformations (averaged over all trajectories within a given time window) with Cα RMSD values from the native structure less than 5 Å (fraction of completely folded structures). The alpha-helix content grows quickly with time, and its variation fits well to a single-exponential term, suggesting fast two-state kinetics. On the other hand, the fraction of folded structures changes more slowly with time and fits to a sum of two exponentials, in agreement with the appearance of the intermediate, found when analyzing the folding pathways. This observation demonstrates that different qualitative and quantitative conclusions about folding kinetics can be drawn depending on which observable is monitored.
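
The second observable described above, the fraction of conformations within 5 Å Cα RMSD of the native structure, can be sketched as follows. This is a simplified illustration, assuming one RMSD value per frame per trajectory and treating each frame as its own time window; the function name and the toy numbers are hypothetical and are not data from the study.

```python
def fraction_folded(rmsd_trajectories, cutoff=5.0):
    """For each frame, return the fraction of trajectories with RMSD below cutoff (in Å)."""
    if not rmsd_trajectories:
        raise ValueError("need at least one trajectory")
    n_frames = min(len(traj) for traj in rmsd_trajectories)
    n_traj = len(rmsd_trajectories)
    return [
        sum(1 for traj in rmsd_trajectories if traj[frame] < cutoff) / n_traj
        for frame in range(n_frames)
    ]

# Toy RMSD series (Å) for four trajectories and three frames; illustrative only.
toy_rmsd = [
    [12.0, 6.0, 4.0],
    [11.0, 4.5, 3.9],
    [13.0, 9.0, 8.0],
    [12.5, 7.0, 4.9],
]
folded_curve = fraction_folded(toy_rmsd)
```

The resulting curve is the quantity that the abstract fits to a sum of two exponentials; the fit itself is not shown here.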

8.
We examine in this paper one of the expected consequences of the hypothesis that modern proteins evolved from random heteropeptide sequences. Specifically, we investigate the lengthwise distributions of amino acids in a set of 1,789 protein sequences with little sequence identity using the run test statistic (r_o) of Mood (1940, Ann. Math. Stat. 11, 367–392). The probability density of r_o for a collection of random sequences has mean = 0 and variance = 1 [the N(0,1) distribution] and can be used to measure the tendency of amino acids of a given type to cluster together in a sequence relative to that of a random sequence. We implement the run test using binary representations of protein sequences in which the amino acids of interest are assigned a value of 1 and all others a value of 0. We consider individual amino acids and sets of various combinations of them based upon hydrophobicity (4 sets), charge (3 sets), volume (4 sets), and secondary structure propensity (3 sets). We find that any sequence chosen randomly has a 90% or greater chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. We regard this as strong support for the random-origin hypothesis. However, we do observe significant deviations from the random expectation, as might be expected after billions of years of evolution. Two important global trends are found: (1) Amino acids with a strong α-helix propensity show a strong tendency to cluster whereas those with β-sheet or reverse-turn propensity do not. (2) Clustered rather than evenly distributed patterns tend to be preferred by the individual amino acids and this is particularly so for methionine. Finally, we consider the problem of reconciling the random nature of protein sequences with structurally meaningful periodic “patterns” that can be detected by sliding-window, autocorrelation, and Fourier analyses. Two examples, rhodopsin and bacteriorhodopsin, show that such patterns are a natural feature of random sequences.
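
The binary encoding and run test described above can be sketched with the textbook Wald-Wolfowitz normal approximation. This may differ in detail and in sign convention from the r_o statistic the authors take from Mood (1940); the function name and the toy sequences are assumptions. In this sketch a negative value means fewer runs than expected, that is, clustering.

```python
import math

def runs_test_z(sequence, residues_of_interest):
    """Standardized run count for a sequence encoded as 1 (of interest) / 0 (other)."""
    bits = [1 if aa in residues_of_interest else 0 for aa in sequence]
    n1 = sum(bits)
    n0 = len(bits) - n1
    n = n1 + n0
    if n1 == 0 or n0 == 0:
        raise ValueError("both classes must occur in the sequence")
    # A new run starts wherever neighbouring symbols differ.
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    mean = 2.0 * n1 * n0 / n + 1.0
    var = 2.0 * n1 * n0 * (2.0 * n1 * n0 - n) / (n * n * (n - 1.0))
    if var <= 0.0:
        raise ValueError("sequence too short for a normal approximation")
    return (runs - mean) / math.sqrt(var)

z_clustered = runs_test_z("AAAALLLL", {"A"})    # two runs: strongly clustered
z_alternating = runs_test_z("ALALALAL", {"A"})  # eight runs: evenly spread
```

For real sequences the set passed as `residues_of_interest` would be one of the groupings the abstract lists, such as a hydrophobicity or charge class.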

9.
We have updated the Protein Sequence-Structure Analysis Relational Database (PSSARD) first published in the Int. J. Biol. Macromol. 36 (2005) 259-262 corresponding to 1573 representative protein chains selected from the Protein Data Bank (PDB). In this, the updated and revised PSSARD (Version 2.0), we have included all proteins in the Protein Data Bank available at the time of developing this database including the NMR PDB entries. The current database corresponds to 22,752 XRAY PDB entries and 3977 NMR PDB entries and is separated accordingly in order to facilitate the appropriate database search. The representative protein chains can also be separately accessed within the current database. We have made a provision to combine more than one field to query the database and the results of any search can be used to carry out further nested searches using a combination of queries. We have provided hyperlinks to the individual PDB entries obtained as the result of any search in PSSARD in order to obtain additional details relevant to the protein structure. Certain applications useful to identify domains and structural motifs are discussed.

10.
A computer algorithm, CLIX, capable of searching a crystallographic data-base of small molecules for candidates which have both steric and chemical likelihood of binding a protein of known three-dimensional structure is presented. The algorithm is a significant advance over previous strategies which consider solely steric or chemical requirements for binding. The algorithm is shown to be capable of predicting the correct binding geometry of sialic acid to a mutant influenza-virus hemagglutinin and of proposing a number of potential new ligands to this protein.

11.
Locating protein coding regions in genomic DNA is a critical step in accessing the information generated by large scale sequencing projects. Current methods for gene detection depend on statistical measures of content differences between coding and noncoding DNA in addition to the recognition of promoters, splice sites, and other regulatory sites. Here we explore the potential value of recurrent amino acid sequence patterns 3-19 amino acids in length as a content statistic for use in gene finding approaches. A finite mixture model incorporating these patterns can partially discriminate protein sequences which have no (detectable) known homologs from randomized versions of these sequences, and from short (≤ 50 amino acids) non-coding segments extracted from the S. cerevisiae genome. The mixture model derived scores for a collection of human exons were not correlated with the GENSCAN scores, suggesting that the addition of our protein pattern recognition module to current gene recognition programs may improve their performance.

12.
A Caflisch, P Niederer, M Anliker. Proteins, 1992, 14(1): 102-109
A new minimization procedure for the global optimization in cartesian coordinate space of the conformational energy of a polypeptide chain is presented. The Metropolis Monte Carlo minimization is thereby supplemented by a thermalization process, which is initiated whenever a structure becomes trapped in an area containing closely located local minima in the conformational space. The method has been applied to the endogenous opioid pentapeptide methionine enkephalin. Five among 13 different starting conformations led to the same apparent global minimum of an in-house developed energy function, a type II' reverse turn, the central residues of which are Gly-3-Phe-4. A comparison between the ECEPP/2 global minimum conformation of methionine enkephalin and the apparent one achieved by the present method shows that minimum-energy conformations having a certain similarity can be generated by relatively different force fields.

13.
We introduce a new algorithm, IRECS (Iterative REduction of Conformational Space), for identifying ensembles of most probable side-chain conformations for homology modeling. On the basis of a given rotamer library, IRECS ranks all side-chain rotamers of a protein according to the probability with which each side chain adopts the respective rotamer conformation. This ranking enables the user to select small rotamer sets that are most likely to contain a near-native rotamer for each side chain. IRECS can therefore act as a fast heuristic alternative to the Dead-End-Elimination algorithm (DEE). In contrast to DEE, IRECS allows for the selection of rotamer subsets of arbitrary size, thus being able to define structure ensembles for a protein. We show that the selection of more than one rotamer per side chain is generally meaningful, since the selected rotamers represent the conformational space of flexible side chains. A knowledge-based statistical potential ROTA was constructed for the IRECS algorithm. The potential was optimized to discriminate between side-chain conformations of native and rotameric decoys of protein structures. By restricting the number of rotamers per side chain to one, IRECS can optimize side chains for a single conformation model. The average accuracy of IRECS for the chi1 and chi1+2 dihedral angles amounts to 84.7% and 71.6%, respectively, using a 40 degrees cutoff. When we compared IRECS with SCWRL and SCAP, the performance of IRECS was comparable to that of both methods. IRECS and the ROTA potential are available for download from the URL http://irecs.bioinf.mpi-inf.mpg.de.
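
The accuracy figures quoted above depend on counting a predicted side-chain dihedral as correct when it lies within a cutoff of the native value. A minimal sketch of that measure for a single angle type follows, assuming angles in degrees, a periodic difference, and an inclusive cutoff; the function names and sample angles are hypothetical, and the chi1+2 measure (both angles correct for the same residue) is not covered.

```python
def angle_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def chi_accuracy(predicted, native, cutoff=40.0):
    """Fraction of predicted angles within cutoff degrees of the native values."""
    if len(predicted) != len(native) or not native:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(1 for p, n in zip(predicted, native)
               if angle_difference(p, n) <= cutoff)
    return hits / len(native)

# Three residues: the second pair differs by only 5 degrees across the +/-180 wrap.
chi1_accuracy = chi_accuracy([-60.0, 180.0, 65.0], [-70.0, -175.0, -60.0])
```

Handling the wrap at ±180° matters in practice, because trans rotamers sit close to that boundary.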

14.
15.
Refinement of distance geometry (DG) structures of EETI-II (Heitz et al.: Biochemistry 28:2392-2398, 1989), a member of the squash family of trypsin inhibitors, has been carried out by restrained molecular dynamics (RMD) in water. The resulting models show better side chain apolar/polar surface ratio and estimated solvation free energy than structures refined "in vacuo." The consistently lower values of residual NMR constraint violations, apolar/polar surface ratio, and solvation free energy for one of these refined structures allowed prediction of the 3D folding and disulfide connectivity of EETI-II. Except for the first few residues, for which no NMR constraints were available, this computer model fully agreed with X-ray structures of CMTI-I (Bode et al.: FEBS Lett. 242:285-292, 1989) and EETI-II complexed with trypsin that appeared after the RMD simulation was completed. Restrained molecular dynamics in water is thus shown to be highly valuable for refinement of DG structures. Also, the successful use of apolar/polar surface ratio and of solvation free energy reinforces the analysis of Novotny et al. (Proteins 4:19-30, 1988) and shows that these criteria are useful indicators of correct versus misfolded models.

16.
Biased usage of synonymous codons has long been explained in terms of cellular tRNA abundance. Taking advantage of publicly available gene expression data for Saccharomyces cerevisiae, a systematic analysis of the codon and amino acid usages in two different coding regions, corresponding to the regular (helix and strand) and the irregular (coil) protein secondary structures, has been performed. Our analyses suggest that apart from tRNA abundance, mRNA folding stability is another major evolutionary force in shaping the codon and amino acid usage differences between the highly and lowly expressed genes in the S. cerevisiae genome and, surprisingly, it depends on the coding regions corresponding to the secondary structures of the encoded proteins. This is a new paradigm in understanding the codon usage in S. cerevisiae. Differential amino acid usage between highly and lowly expressed genes in the regions coding for the irregular protein secondary structure in S. cerevisiae is explained by the stability of the mRNA folded structure. Irrespective of the protein secondary structural type, the highly expressed genes always tend to encode cheaper amino acids in order to reduce the overall biosynthetic cost of production of the corresponding protein. This study supports the hypothesis that tRNA abundance is a consequence of, and not a reason for, the biased usage of amino acids between highly and lowly expressed genes.

17.
Generation of full protein coordinates from limited information, e.g., the Cα coordinates, is an important step in protein homology modeling and structure determination, and molecular dynamics (MD) simulations may prove to be important in this task. We describe a new method, in which the protein backbone is built quickly in a rather crude way and then refined by minimization techniques. Subsequently, the side chains are positioned using extensive MD calculations. The method is tested on two proteins, and results compared to proteins constructed using two other MD-based methods. In the first method, we supplemented an existing backbone building method with a new procedure to add side chains. The second one largely consists of available methodology. The constructed proteins are compared to the corresponding X-ray structures, which became available during this study, and they are in good agreement (backbone RMS values of 0.5–0.7 Å, and all-atom RMS values of 1.5–1.9 Å). This comparative study indicates that extensive MD simulations are able, to some extent, to generate details of the native protein structure, and may contribute to the development of a standardized methodology to predict reliably (parts of) protein structures when only partial coordinate data are available. © 1994 John Wiley & Sons, Inc.

18.
The results of two 30-ps molecular dynamics simulations of the trp repressor and trp aporepressor proteins are presented in this paper. The simulations were obtained using the AMBER molecular mechanical force field and in both simulations a 6-Å shell of TIP3P waters surrounded the proteins. The trp repressor protein is a DNA-binding regulatory protein and it utilizes a helix-turn-helix (D helix-turn-E helix) motif to interact with DNA. The trp aporepressor, lacking two molecules of the L-tryptophan corepressor, cannot bind specifically to DNA. Our simulations show that the N- and C-termini and the residues in and near the helix-turn-helix motifs are the most mobile regions of the proteins, in agreement with the X-ray crystallographic studies. Our simulations also find increased mobility of the residues in the turn-D helix-turn regions of the proteins. We find the average distance separating the DNA-binding motifs to be larger in the repressor as compared to the aporepressor. In addition to examining the protein residue fluctuations and deviations with respect to X-ray structures, we have also focused on backbone dihedral angles and corepressor hydrogen-bonding patterns in this paper.

19.
An automated method for the optimal placement of polar hydrogens in a protein structure is described. This method treats the polar, side chain hydrogens of lysine, serine, threonine, and tyrosine and the amino terminus of a protein. The program, called NETWORK, divides the potential hydrogen-bonding pairs of a protein into groups of interacting donors and acceptors. A search is conducted on each of the local groups to find an arrangement which forms the most hydrogen bonds. If two or more arrangements have the same number of hydrogen bonds, the arrangement with the shortest set of hydrogen bonds is selected. The polar hydrogens of the histidyl side chain are specifically treated, and the ionization state of this residue is allowed to change, if this change results in additional hydrogen bonds for the local group. The program will accept Protein Data Bank as well as Biosym-format coordinate files. Input and output routines can be easily modified to accept other coordinate file formats. The predictions from this method are compared to known hydrogen positions for bovine pancreatic trypsin inhibitor, insulin, RNase-A, and trypsin for which the neutron diffraction structures have been determined. The usefulness of this program is further demonstrated by a comparison of molecular dynamics simulations for the enzyme cytochrome P-450cam with and without using NETWORK.
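
The selection rule stated above (most hydrogen bonds first, then the shortest set of hydrogen bonds) can be sketched as a two-level sort key. This is not the NETWORK program's code. It assumes each candidate arrangement of a local group is already enumerated as a list of hydrogen-bond lengths in Å, and it reads "shortest set" as the smallest summed length; the function name and labels are hypothetical.

```python
def pick_arrangement(arrangements):
    """Return the label of the preferred arrangement.

    arrangements maps a label to the list of hydrogen-bond lengths (Å) that
    the arrangement forms. More bonds wins; ties go to the smaller summed
    length (an assumed reading of "shortest set of hydrogen bonds").
    """
    if not arrangements:
        raise ValueError("need at least one arrangement")
    return min(
        arrangements,
        key=lambda label: (-len(arrangements[label]), sum(arrangements[label])),
    )

candidates = {
    "a": [2.9, 3.1],  # two bonds, total 6.0 Å
    "b": [2.8, 3.0],  # two bonds, total 5.8 Å
    "c": [2.7],       # one short bond only
}
best = pick_arrangement(candidates)
```

Here arrangement "c" loses despite having the shortest single bond, because the bond count is compared before any length.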

20.
α-Prolamins are the major seed storage proteins of species of the grass tribe Andropogoneae. They are unusually rich in glutamine, proline, alanine, and leucine residues and their sequences show a series of tandem repeats presumed to be the result of multiple intragenic duplication. Two new sequences of α-prolamin clones from Coix (pBCX25.12 and pBCX25.10) are compared with similar clones from maize and Sorghum in order to investigate evolutionary relationships between the repeat motifs and to propose a schematic model for their three-dimensional structure based on hydrophobic membrane-helix propensities and helical “wheels.” A scheme is proposed for the most recent events in the evolution of the central part of the molecule (repeats 3 to 8) which involves two partial intragenic duplications and in which contemporary odd-numbered and even-numbered repeats arise from common ancestors, respectively. Each pair of repeats is proposed to form an antiparallel α-helical hairpin, and the helices of the molecule as a whole are proposed to be arranged on a hexagonal net. The majority of helices show six faces of alternating hydrophobic and polar residues, which give rise to interstitial holes around each helix which alternate in chemical character. The model is consistent with proteins which contain different numbers of repeats, with oligomerization and with the dense packaging of α-prolamins within the protein body of the seed endosperm. © 1993 Wiley-Liss, Inc.
