Similar articles
20 similar articles found.
1.
P.R.E.S.S. is an R package that gives researchers access to, and tools to manipulate, a large set of statistical data on protein residue-level structural properties such as residue-level virtual bond lengths, virtual bond angles, and virtual torsion angles. A large set of high-resolution protein structures is downloaded and surveyed. Their residue-level structural properties are calculated and documented. The statistical distributions and correlations of these properties can be queried and displayed. Tools are also provided for modeling and analyzing a given structure in terms of its residue-level structural properties. In particular, new tools for computing residue-level statistical potentials and displaying residue-level Ramachandran-like plots are developed for structural analysis and refinement. P.R.E.S.S. has been released in R as an open source software package, with a user-friendly GUI, accessible and executable by a public user in any R environment. P.R.E.S.S. can also be downloaded directly at http://www.math.iastate.edu/press/.
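The residue-level quantities named in this abstract can be illustrated with a minimal Python sketch. It assumes each residue is represented by a single point (for example its Cα position); the function names are hypothetical and are not part of the P.R.E.S.S. package, which is written in R. The torsion sign follows the usual IUPAC convention.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def virtual_bond_length(p0, p1):
    """Distance between two consecutive residue points."""
    return _norm(_sub(p1, p0))

def virtual_bond_angle(p0, p1, p2):
    """Angle in degrees at p1 formed by three consecutive residue points."""
    u, v = _sub(p0, p1), _sub(p2, p1)
    cos_theta = _dot(u, v) / (_norm(u) * _norm(v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

def virtual_torsion_angle(p0, p1, p2, p3):
    """Signed dihedral in degrees defined by four consecutive residue points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = _norm(b2) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates for four consecutive residues.
ca = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 1.0, 1.0)]
length = virtual_bond_length(ca[0], ca[1])
angle = virtual_bond_angle(ca[0], ca[1], ca[2])
torsion = virtual_torsion_angle(*ca)
```

Sliding these functions along a chain (windows of two, three, and four residues) yields the per-residue values whose distributions a survey of this kind would tabulate.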

2.
Here we extend the ability to predict hydrodynamic coefficients and other solution properties of rigid macromolecular structures from atomic-level structures, implemented in the computer program HYDROPRO, to models with lower, residue-level resolution. Whereas in the former case there is one bead per nonhydrogen atom, the latter contains one bead per amino acid (or nucleotide) residue, thus allowing calculations when atomic resolution is not available or coarse-grained models are preferred. We parameterized the effective hydrodynamic radius of the elements in the atomic- and residue-level models using a very large set of experimental data for translational and rotational coefficients (intrinsic viscosity and radius of gyration) for >50 proteins. We also extended the calculations to very large proteins and macromolecular complexes, such as the whole 70S ribosome. We show that with proper parameterization, the two levels of resolution yield similar and rather good agreement with experimental data. The new version of HYDROPRO, in addition to considering various computational and modeling schemes, is far more efficient computationally and can be handled with the use of a graphical interface.

3.
Solid-state NMR is especially useful when the structures of peptides and proteins should be analyzed by taking into account the structural distribution, that is, the distribution of the torsion angle of the individual residue. In this study, two-dimensional spin-diffusion solid-state NMR spectra of 13C-double-labeled model peptides (GPGGA)6G of flagelliform silk were observed for studying the local structure in the solid state. The spin-diffusion NMR spectra calculated by assuming the torsion angles of the beta-spiral structure exclusively could not reproduce the observed spectra. In contrast, the spectra calculated by taking into account the statistical distribution of the torsion angles of the individual central residues in the sequences Ala-Gly-Pro, Gly-Pro-Gly, Pro-Gly-Gly, Gly-Gly-Ala, and Gly-Ala-Gly from PDB data could reproduce the observed spectra well. This indicates that the statistical distribution of the torsion angles should be considered for the structural model of (GPGGA)6G similar to the case of the model peptide of elastin.

4.
We investigate several approaches to coarse-grained normal mode analysis of protein residue-level structural fluctuations by choosing different ways of representing the residues and the forces among them. Single-atom representations using the backbone atoms Cα, C, N, and Cβ are considered. Combinations of some of these atoms are also tested. The force constants between the representative atoms are extracted from the Hessian matrix of the energy function and serve as the force constants between the corresponding residues. The residue mean-square fluctuations and their correlations with the experimental B-factors are calculated for a large set of proteins. The results are compared with all-atom normal mode analysis and the residue-level Gaussian Network Model. The coarse-grained methods perform more efficiently than all-atom normal mode analysis, while their B-factor correlations are also higher. Their B-factor correlations are comparable with those estimated by the Gaussian Network Model and in many cases better. The extracted force constants are surveyed for different pairs of residues with different numbers of separation residues in sequence. The statistical averages are used to build a refined Gaussian Network Model, which is able to predict residue-level structural fluctuations significantly better than the conventional Gaussian Network Model in many test cases.

5.
Amir ED, Kalisman N, Keasar C. Proteins 2008, 72(1):62-73
Rotatable torsion angles are the major degrees of freedom in proteins. Adjacent angles are highly correlated and energy terms that rely on these correlations are intensively used in molecular modeling. However, the utility of torsion-based terms is not yet fully exploited. Many of these terms do not capture the full scale of the correlations. Other terms, which rely on lookup tables, cannot be used in the context of force-driven algorithms because they are not fully differentiable. This study aims to extend the usability of torsion terms by presenting a set of high-dimensional and fully differentiable energy terms that are derived from high-resolution structures. The set includes terms that describe backbone conformational probabilities and propensities, side-chain rotamer probabilities, and an elaborate term that couples all the torsion angles within the same residue. The terms are constructed by cubic spline interpolation with periodic boundary conditions that enable full differentiability and high computational efficiency. We show that the spline implementation does not compromise the accuracy of the original database statistics. We further show that the side-chain relevant terms are compatible with established rotamer probabilities. Despite their very local characteristics, the new terms are often able to identify native and native-like structures within decoy sets. Finally, force-based minimization of NMR structures with the new terms improves their torsion angle statistics with minor structural distortion (0.5 Å RMSD on average). The new terms are freely available in the MESHI molecular modeling package. The spline coefficients are also available as a documented MATLAB file.

6.
To study the interrelation between the spectral and structural properties of fluorescent proteins, structures of mutants of monomeric red fluorescent protein mRFP1 with all possible point mutations of Gln66 (except replacement by Pro) were simulated by molecular dynamics. A global search for correlations between geometrical structure parameters and some spectral characteristics (absorption maximum wavelength, integral extinction coefficient at the absorption maximum, excitation maximum wavelength, emission maximum wavelength, and quantum yield) was performed for the chromophore and its 6 Å environment in mRFP1, Q66A, Q66L, Q66S, Q66C, Q66H, and Q66N. The correlation coefficients (0.81-0.87) were maximal for torsion angles in the phenolic and imidazolidine rings as well as for torsion angles in the regions of connection between these rings and of chromophore attachment to the beta-barrel. The data can be used to predict the spectral properties of fluorescent proteins based on their structures and to reveal promising positions for directed mutagenesis.

7.
The accurate determination of a large number of protein structures by X-ray crystallography makes it possible to conduct a reliable statistical analysis of the distribution of the main-chain and side-chain conformational angles, and of how these depend on residue type, adjacent residue in the sequence, secondary structure, residue-residue interactions and location at the polypeptide chain termini. The interrelationship between the main-chain (phi, psi) and side-chain (chi 1) torsion angles leads to a classification of amino acid residues that simplifies the folding alphabet considerably and can be a guide to the design of new proteins or mutational studies. Analyses of residues occurring with disallowed main-chain conformation or with multiple conformations shed some light on why some residues are less favoured in thermophiles.

8.
Due to the limited distance data available from the experiments, the structures determined by NMR spectroscopy may not always be as accurate as desired. Further refinement of the structures is often required and sometimes critical. With the increase of high-quality protein structures determined and deposited in the Protein Data Bank (PDB), commonly shared protein conformational properties can be extracted based on the statistical distributions of the properties in the structural database and used to improve the outcomes of the NMR-determined structures. Here we examine the distributions of protein interatomic distances in known protein structures. We show that based on these distributions, a set of mean-force potentials can be defined for proteins and employed to refine the NMR-determined structures. We report the test results on 70 NMR-determined structures and compare the potential energy, the Ramachandran plot, and the ensemble RMSD of the structures refined with and without using the derived mean-force potentials.
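One common way to turn a distance distribution into a mean-force potential is Boltzmann inversion, E(d) = -kT ln[P_obs(d) / P_ref(d)]. The sketch below illustrates that idea only; the abstract does not state which functional form, reference state, or binning the authors used, so the function name, the reference sample, and the parameters here are all assumptions.

```python
import math

def mean_force_potential(observed, reference, bin_width, n_bins, kT=1.0):
    """Boltzmann-inverted potential per distance bin (assumed form, see lead-in).

    observed, reference: lists of distances; the reference sample stands in
    for whatever reference state is chosen. Returns a list of energies,
    with None for bins that are empty in either sample.
    """
    def histogram(values):
        counts = [0] * n_bins
        for d in values:
            k = int(d // bin_width)
            if 0 <= k < n_bins:
                counts[k] += 1
        total = sum(counts)
        return [c / total if total else 0.0 for c in counts]

    p_obs, p_ref = histogram(observed), histogram(reference)
    energies = []
    for po, pr in zip(p_obs, p_ref):
        if po > 0.0 and pr > 0.0:
            energies.append(-kT * math.log(po / pr))
        else:
            energies.append(None)  # no statistics for this bin
    return energies

# Toy data: distances near 1.5 are over-represented relative to the reference.
energies = mean_force_potential(
    observed=[1.5, 1.5, 1.5, 2.5],
    reference=[1.5, 1.5, 2.5, 2.5],
    bin_width=1.0,
    n_bins=3,
)
```

Bins that occur more often than in the reference get negative (favourable) energies, which is what lets such a term pull a refined structure toward commonly observed distances.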

9.
Residue-level coarse-grained (CG) models have become one of the most popular tools in biomolecular simulations in the trade-off between modeling accuracy and computational efficiency. To investigate large-scale biological phenomena in molecular dynamics (MD) simulations with CG models, unified treatments of proteins and nucleic acids, as well as efficient parallel computations, are indispensable. In the GENESIS MD software, we implement several residue-level CG models, covering structure-based and context-based potentials for both well-folded biomolecules and intrinsically disordered regions. An amino acid residue in a protein is represented as a single CG particle centered at the Cα atom position, while a nucleotide in RNA or DNA is modeled with three beads. Thus, a single CG particle represents around ten heavy atoms in both proteins and nucleic acids. The input data in CG MD simulations are treated as GROMACS-style input files generated from a newly developed toolbox, GENESIS-CG-tool. To optimize the performance in CG MD simulations, we utilize multiple neighbor lists, each of which is attached to a different nonbonded interaction potential in the cell-linked list method. We found that random number generation for Gaussian distributions in the Langevin thermostat is one of the bottlenecks in CG MD simulations. Therefore, we parallelize the computations with message-passing-interface (MPI) to improve the performance on PC clusters or supercomputers. We simulate Herpes simplex virus (HSV) type 2 B-capsid and chromatin models containing more than 1,000 nucleosomes in GENESIS as examples of large-scale biomolecular simulations with residue-level CG models. This framework extends accessible spatial and temporal scales by multi-scale simulations to study biologically relevant phenomena, such as genome-scale chromatin folding or phase-separated membrane-less condensations.

10.
Zhou H, Zhou Y. Proteins 2004, 55(4):1005-1013
An elaborate knowledge-based energy function is designed for fold recognition. It is a residue-level single-body potential, so that a highly efficient dynamic programming method can be used for alignment optimization. It contains a backbone torsion term, a buried surface term, and a contact-energy term. The energy score combined with sequence profile and secondary structure information leads to an algorithm called SPARKS (Sequence, secondary structure Profiles and Residue-level Knowledge-based energy Score) for fold recognition. Compared with the popular PSI-BLAST, SPARKS is 21% more accurate in sequence-sequence alignment in the ProSup benchmark and 10%, 25%, and 20% more sensitive in detecting the family, superfamily, and fold similarities in the Lindahl benchmark, respectively. Moreover, it is one of the best methods for sensitivity (the number of correctly recognized proteins), alignment accuracy (based on the MaxSub score), and specificity (the average number of correctly recognized proteins whose scores are higher than the first false positives) in LiveBench 7 among more than twenty servers of non-consensus methods. The simple algorithm used in SPARKS has the potential for further improvement. This highly efficient method can be used for fold recognition on genomic scales. A web server is established for academic users at http://theory.med.buffalo.edu.
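The reason a single-body potential permits dynamic programming is that the score for placing query residue i at template position j depends only on (i, j), not on where other residues are placed. The following is a generic global-alignment sketch under that assumption, with a linear gap penalty and a score that is maximized; it is not the SPARKS implementation, and the function name and toy score matrix are hypothetical.

```python
def best_alignment_score(scores, gap):
    """Global alignment by dynamic programming (Needleman-Wunsch style).

    scores[i][j]: score for matching query residue i to template position j.
    gap: score added for each unmatched residue (usually negative).
    Returns the best total score over all global alignments.
    """
    n, m = len(scores), len(scores[0])
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + scores[i - 1][j - 1],  # match i with j
                dp[i - 1][j] + gap,                       # query residue unmatched
                dp[i][j - 1] + gap,                       # template position unmatched
            )
    return dp[n][m]

# Toy 2-residue query against a 3-position template.
toy_scores = [[2, -1, -1],
              [-1, -1, 2]]
best = best_alignment_score(toy_scores, gap=-1)
```

The table has (n+1)(m+1) cells and each is filled in constant time, which is why this kind of scoring scales to genome-wide fold recognition; a pairwise (two-body) term would break this independence.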

11.
12.
There are many kinds of silks from silkworms and spiders with different structures and properties, and thus, silks are suitable for studying the structure-property relationship of fibrous proteins. Silk fibroin from a wild silkworm, Samia cynthia ricini, mainly consists of similar sequences repeated about 100 times, in which the polyalanine (Ala)12-13 region and the Gly-rich region alternate. In this paper, a sequential model peptide, GGAGGGYGGDGG(A)12GGAGDGYGAG, which is a typical sequence of the silk fibroin, was synthesized, and the atomic-level conformations of Gly residues at the N- and C-terminal ends of the polyalanine region were determined as well as that of the central Ala residue using 13C 2D spin-diffusion solid-state nuclear magnetic resonance (NMR) under off-magic angle spinning. In the model peptide with alpha-helical conformation, the torsion angles of the central Ala residue, the 19th Ala, were determined to be (phi, psi) = (-60°, -50°), which is a typical alpha-helical structure, but the torsion angles of two Gly residues, the 12th and 25th Gly residues, which are located at the N- and C-terminal ends of the polyalanine region, were determined to be (phi, psi) = (-70°, -30°) and (phi, psi) = (-70°, -20°), respectively. Thus, it was observed that the turns at both ends of polyalanine with alpha-helix conformation in the model peptide are tightly wound.

13.
A detailed analysis of structural and position-dependent characteristic features of helices will give a better understanding of secondary structure formation in globular proteins. Here we describe an algorithm that quantifies the geometry of helices in proteins on the basis of their Cα atoms alone. The Fortran program HELANAL can extract the helices from PDB files and then characterises the overall geometry of each helix as being linear, curved or kinked, in terms of its local structural features, viz. local helical twist and rise, virtual torsion angle, local helix origins and bending angles between successive local helix axes. Even helices with a large radius of curvature are unambiguously identified as being linear or curved. The program can also be used to differentiate a kinked helix from other motifs, such as helix-loop-helix or helix-turn-helix (with a single-residue linker), with the help of local bending angles. In addition, the program can be used to characterise the helix start and end as well as other types of secondary structures.

14.
We probe the stability and near-native energy landscape of protein fold space using powerful conformational sampling methods together with simple reduced models and statistical potentials. Fold space is represented by a set of 280 protein domains spanning all topological classes and having a wide range of lengths (33-300 residues), amino acid compositions and numbers of secondary structural elements. The degrees of freedom are taken as the loop torsion angles. This choice preserves the native secondary structure but allows the tertiary structure to change. The proteins are represented by three-point-per-residue, three-dimensional models with statistical potentials derived from a knowledge-based study of known protein structures. When this space is sampled by a combination of parallel tempering and equi-energy Monte Carlo, we find that the three-point model captures the known stability of protein native structures with stable energy basins that are near-native (all α: 4.77 Å, all β: 2.93 Å, α/β: 3.09 Å, α+β: 4.89 Å on average and within 6 Å for 71.41%, 92.85%, 94.29% and 64.28% for the all-α, all-β, α/β and α+β classes, respectively). Denatured structures also occur and these have interesting structural properties that shed light on the different landscape characteristics of α and β folds. We find that α/β proteins with alternating α and β segments (such as the β-barrel) are more stable than proteins in other fold classes.

15.
It is becoming clear that, in addition to structural properties, the mechanical properties of proteins can play an important role in their biological activity. It nevertheless remains difficult to probe these properties experimentally. Whereas single-molecule experiments give access to overall mechanical behavior, notably the impact of end-to-end stretching, it is currently impossible to directly obtain data on more local properties. We propose a theoretical method for probing the mechanical properties of protein structures at the single-amino acid level. This approach can be applied to both all-atom and simplified protein representations. The probing leads to force constants for local deformations and to deformation vectors indicating the paths of least mechanical resistance. It also reveals the mechanical coupling that exists between residues. Results obtained for a variety of proteins show that the calculated force constants vary over a wide range. An analysis of the induced deformations provides information that is distinct from that obtained with measures of atomic fluctuations and is more easily linked to residue-level properties than normal mode analyses or dynamic trajectories. It is also shown that the mechanical information obtained by residue-level probing opens a new route for defining so-called dynamical domains within protein structures.

16.
Faraggi E, Xue B, Zhou Y. Proteins 2009, 74(4):847-856
This article attempts to increase the prediction accuracy of residue solvent accessibility and real-value backbone torsion angles of proteins through improved learning. Most methods developed for improving the backpropagation algorithm of artificial neural networks are limited to small neural networks. Here, we introduce a guided-learning method suitable for networks of any size. The method employs a part of the weights for guiding and the other part for training and optimization. We demonstrate this technique by predicting residue solvent accessibility and real-value backbone torsion angles of proteins. In this application, the guiding factor is designed to satisfy the intuitive condition that for most residues, the contribution of a residue to the structural properties of another residue is smaller for greater separation in the protein-sequence distance between the two residues. We show that the guided-learning method makes a 2-4% reduction in 10-fold cross-validated mean absolute errors (MAE) for predicting residue solvent accessibility and backbone torsion angles, regardless of the size of the database, the number of hidden layers and the size of input windows. This, together with the introduction of a two-layer neural network with a bipolar activation function, leads to a new method that has an MAE of 0.11 for residue solvent accessibility, 36 degrees for psi, and 22 degrees for phi. The method is available as the Real-SPINE 3.0 server at http://sparks.informatics.iupui.edu.

17.
Intrinsically disordered proteins (IDPs)/regions do not have well-defined secondary and tertiary structures; however, they are functional, and it is critical to gain a deep understanding of their residue packing. The shape distributions methodology, which is usually utilized in pattern recognition, clustering, and classification studies in computer science, may be adopted to study the residue packing of proteins. In this study, shape distributions of globular proteins and IDPs were obtained to shed light on the residue packing of their structures. The shape feature used is the sphericity of tetrahedra obtained by Delaunay tessellation of the Cα coordinates. The sphericity probability distributions were then compared using Principal Component Analysis. This computational structural study shows that the set of IDPs constitutes a more diverse set than the set of globular proteins in terms of the geometrical properties of their network structures.
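As an illustration of the shape feature, the sketch below computes the sphericity of a single tetrahedron from its four vertices. It assumes the common Wadell definition, Ψ = π^(1/3) (6V)^(2/3) / A, where V is the volume and A the surface area; the abstract does not say which definition the study used, and the Delaunay tessellation step that would produce the tetrahedra is deliberately left out. The function name is hypothetical.

```python
import math

def tetrahedron_sphericity(a, b, c, d):
    """Wadell sphericity of the tetrahedron with vertices a, b, c, d (assumed definition)."""
    def sub(p, q):
        return (p[0] - q[0], p[1] - q[1], p[2] - q[2])

    def cross(u, v):
        return (u[1] * v[2] - u[2] * v[1],
                u[2] * v[0] - u[0] * v[2],
                u[0] * v[1] - u[1] * v[0])

    def dot(u, v):
        return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

    def triangle_area(p, q, r):
        n = cross(sub(q, p), sub(r, p))
        return 0.5 * math.sqrt(dot(n, n))

    # Volume from the scalar triple product of three edge vectors.
    volume = abs(dot(sub(b, a), cross(sub(c, a), sub(d, a)))) / 6.0
    area = (triangle_area(a, b, c) + triangle_area(a, b, d)
            + triangle_area(a, c, d) + triangle_area(b, c, d))
    if area == 0.0:
        return 0.0  # all four points coincide or are collinear
    return math.pi ** (1.0 / 3.0) * (6.0 * volume) ** (2.0 / 3.0) / area

# A regular tetrahedron gives the maximum value for this shape class.
regular = tetrahedron_sphericity((1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1))
# Four coplanar points give a flat, zero-volume tetrahedron.
flat = tetrahedron_sphericity((0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0))
```

Collecting this value over every tetrahedron in a tessellation gives one shape distribution per protein, which is the kind of per-structure feature vector that could then be compared across structures.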

18.
Statistical energy functions are general models about atomic or residue-level interactions in biomolecules, derived from existing experimental data. They provide quantitative foundations for structural modeling as well as for structure-based protein sequence design. Statistical energy functions can be derived computationally either based on statistical distributions or based on variational assumptions. We present overviews on the theoretical assumptions underlying the various types of approaches. Theoretical considerations underlying important pragmatic choices are discussed.

19.
Designing protein sequences that fold to a given three-dimensional (3D) structure has long been a challenging problem in computational structural biology with significant theoretical and practical implications. In this study, we first formulated this problem as predicting the residue type given the 3D structural environment around the Cα atom of a residue, which is repeated for each residue of a protein. We designed a nine-layer 3D deep convolutional neural network (CNN) that takes as input a gridded box with the atomic coordinates and types around a residue. Several CNN layers were designed to capture structure information at different scales, such as bond lengths, bond angles, torsion angles, and secondary structures. Trained on a very large number of protein structures, the method, called ProDCoNN (protein design with CNN), achieved state-of-the-art performance when tested on large numbers of test proteins and benchmark datasets.

20.
Dor O, Zhou Y. Proteins 2007, 68(1):76-81
Proteins can move freely in three-dimensional space. As a result, their structural properties, such as solvent-accessible surface area, backbone dihedral angles, and atomic distances, are continuous variables. However, these properties are often arbitrarily divided into a few classes to facilitate prediction by statistical learning techniques. In this work, we establish an integrated system of neural networks (called Real-SPINE) for real-value prediction and apply the method to predict residue solvent accessibility and backbone psi dihedral angles of proteins based on information derived from sequences only. Real-SPINE is trained with a large data set of 2640 protein chains, sequence profiles generated from multiple sequence alignment, representative amino-acid properties, a slow learning rate, overfitting protection, and predicted secondary structures. The method optimizes more than 200,000 weights and yields a 10-fold cross-validated Pearson's correlation coefficient (PCC) of 0.74 between predicted and actual solvent-accessible surface areas and 0.62 between predicted and actual psi angles. In particular, 90% of the 2640 proteins have a PCC value greater than 0.6 between predicted and actual solvent-accessible surface areas. The results of Real-SPINE can be compared with the best reported correlation coefficients of 0.64-0.67 for solvent-accessible surface areas and 0.47 for psi angles. The Real-SPINE server, executable programs, and datasets are freely available at http://sparks.informatics.iupui.edu.
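The Pearson correlation coefficient used to score real-value predictions here can be computed directly from its definition, as in the sketch below. The function name and the toy values are illustrative only and are not taken from Real-SPINE; note also that a plain PCC treats angles as ordinary numbers and ignores their periodicity, which the abstract does not discuss.

```python
import math

def pearson_correlation(predicted, actual):
    """Pearson's r between two equal-length sequences of real values."""
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0.0 or var_a == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_a)

# Toy per-residue values (hypothetical): a perfectly linear relationship.
r_perfect = pearson_correlation([0.1, 0.2, 0.3], [0.2, 0.4, 0.6])
r_inverse = pearson_correlation([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Because r is invariant to shifting and scaling either sequence, it measures how well the predictor ranks and tracks the trend of the values, which is why it is often reported alongside an absolute error measure.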
