Similar articles
 20 similar articles found (search time: 31 ms)
1.
We have developed a new methodology that determines protein structures using small-angle X-ray scattering (SAXS) data. The current bottlenecks in determining protein structures require a new strategy based on a simple experimental design, and SAXS is suitable for this purpose in spite of its low information content. First, we demonstrated that SAXS constraints work additively to NMR-derived information in calculating structures. Next, structure calculations for nine proteins taking different folds were performed using the SAXS constraints combined with the NMR-derived distance restraints for local geometry such as secondary structures or those for tertiary structure. The results show that the SAXS constraints complemented the tertiary-structural information for all the proteins, and that the accuracy of the structures thus obtained with SAXS constraints and local geometrical restraints ranged from 1.85 to 4.33 Å. Based on these results, we were able to construct a coarse-grained protein model at amino acid residue resolution.

2.
The problem of constructing all-atom model co-ordinates of a protein from an outline of the polypeptide chain is encountered in protein structure determination by crystallography or nuclear magnetic resonance spectroscopy, in model building by homology and in protein design. Here, we present an automatic procedure for generating full protein co-ordinates (backbone and, optionally, side-chains) given the Cα trace and amino acid sequence. To construct backbones, a protein structure database is first scanned for fragments that locally fit the chain trace according to distance criteria. A best path algorithm then sifts through these segments and selects an optimal path with minimal mismatch at fragment joints. In blind tests, using fully known protein structures, backbones (Cα, C, N, O) can be reconstructed with a reliability of 0.4 to 0.6 Å root-mean-square position deviation and not more than 0 to 5% peptide flips. This accuracy is sufficient to identify possible errors in protein co-ordinate sets. To construct full co-ordinates, side-chains are added from a library of frequently occurring rotamers using a simple and fast Monte Carlo procedure with simulated annealing. In tests on X-ray structures determined at better than 2.5 Å resolution, the positions of side-chain atoms in the protein core (less than 20% relative accessibility) have an accuracy of 1.6 Å (r.m.s. deviation) and 70% of χ1 angles are within 30° of the X-ray structure. The computer program MaxSprout is available on request.
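The "best path" step described in this abstract can be illustrated with a small dynamic-programming sketch. This is not the MaxSprout implementation: the function name, the per-position candidate lists, and the cost callbacks are hypothetical, and the sketch only assumes that each chain position has a list of candidate fragments with a local misfit cost and a pairwise mismatch cost at each joint.

```python
def best_fragment_path(fit_cost, joint_cost):
    """Pick one candidate fragment per position with minimal total cost.

    fit_cost[i][k]  : misfit of candidate k at position i (hypothetical input).
    joint_cost(i, p, k): mismatch when candidate p at position i-1 is joined
                         to candidate k at position i (hypothetical input).
    Returns (total_cost, list of chosen candidate indices).
    """
    best = list(fit_cost[0])          # best cost of a path ending in each candidate
    back = []                         # back-pointers for every later position
    for i in range(1, len(fit_cost)):
        cur, ptr = [], []
        for k, cost in enumerate(fit_cost[i]):
            options = [best[p] + joint_cost(i, p, k) for p in range(len(best))]
            p_min = min(range(len(options)), key=options.__getitem__)
            cur.append(options[p_min] + cost)
            ptr.append(p_min)
        best = cur
        back.append(ptr)
    # Trace the cheapest path back from the last position.
    k = min(range(len(best)), key=best.__getitem__)
    total, path = best[k], [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return total, path
```

The cost of the search grows with the number of positions times the square of the number of candidates per position, which is why a database pre-filter on distance criteria is useful before this step.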

3.
The principal bottleneck in protein structure prediction is the refinement of models from lower accuracies to the resolution observed by experiment. We developed a novel constraints‐based refinement method that identifies a high number of accurate input constraints from initial models and rebuilds them using restrained torsion angle dynamics (rTAD). We previously created a Bayesian statistics‐based residue‐specific all‐atom probability discriminatory function (RAPDF) to discriminate native‐like models by measuring the probability of accuracy for atom type distances within a given model. Here, we exploit RAPDF to score (i.e., filter) constraints from initial predictions that may or may not be close to a native‐like state, obtain consensus of top scoring constraints amongst five initial models, and compile sets with no redundant residue pair constraints. We find that this method consistently produces a large and highly accurate set of distance constraints from which to build refinement models. We further optimize the balance between accuracy and coverage of constraints by producing multiple structure sets using different constraint distance cutoffs, and note that the cutoff governs spatially near versus distant effects in model generation. This complete procedure of deriving distance constraints for rTAD simulations improves the quality of initial predictions significantly in all cases evaluated by us. Our procedure represents a significant step in solving the protein structure prediction and refinement problem, by enabling the use of consensus constraints, RAPDF, and rTAD for protein structure modeling and refinement. Proteins 2009. © 2009 Wiley‐Liss, Inc.

4.
Voltage-gated sodium channels are targets for many drugs and toxins. However, the rational design of medically relevant channel modulators is hampered by the lack of x-ray structures of eukaryotic channels. Here, we used a homology model based on the x-ray structure of the NavAb prokaryotic sodium channel together with published experimental data to analyze interactions of the μ-conotoxins GIIIA, PIIIA, and KIIIA with the Nav1.4 eukaryotic channel. Using Monte Carlo energy minimizations and published experimentally defined pairwise contacts as distance constraints, we developed a model in which specific contacts between GIIIA and Nav1.4 were readily reproduced without deformation of the channel or toxin backbones. Computed energies of specific interactions between individual residues of GIIIA and the channel correlated with experimental estimates. The predicted complexes of PIIIA and KIIIA with Nav1.4 are consistent with a large body of experimental data. In particular, a model of Nav1.4 interactions with KIIIA and tetrodotoxin (TTX) indicated that TTX can pass between Nav1.4 and channel-bound KIIIA to reach its binding site at the selectivity filter. Our models also allowed us to explain experimental data that currently lack structural interpretations. For instance, consistent with the incomplete block observed with KIIIA and some GIIIA and PIIIA mutants, our computations predict an uninterrupted pathway for sodium ions between the extracellular space and the selectivity filter if at least one of the four outer carboxylates is not bound to the toxin. We found a good correlation between computational and experimental data on complete and incomplete channel block by native and mutant toxins. Thus, our study suggests similar folding of the outer pore region in eukaryotic and prokaryotic sodium channels.

5.
We present a comprehensive analysis of methods for improving the fold recognition rate of the threading approach to protein structure prediction by using a few additional distance constraints. The distance constraints between protein residues may be obtained by experiments such as mass spectrometry or NMR spectroscopy. We applied a post-filtering step with new scoring functions incorporating measures of constraint satisfaction to ranking lists of 123D threading alignments. The detailed analysis of the results on a small representative benchmark set shows that the fold recognition rate can be improved significantly by up to 30% from about 54%-65% to 77%-84%, approaching the maximal attainable performance of 90% estimated by structural superposition alignments. This gain in performance adds about 10% to the recognition rate already achieved in our previous study with cross-link constraints only. Additional recent results on a larger benchmark set involving a confidence function for threading predictions also indicate notable improvements by our combined approach, which should be particularly valuable for rapid structure determination and validation of protein models.

6.
We report the application of an integrated computational approach for biomolecular structure determination at a low resolution. In particular, a neural network is trained to predict the spatial proximity of C-alpha atoms that are less than a given threshold apart, whereas a Kalman filter algorithm is employed to outline the biomolecular fold, with a constraints set that includes these pairwise atomic distances, and the distances and angles that define the structure as it is known from the protein's sequence. The results for Crambin demonstrate that this integrated approach is useful for molecular structure prediction at a low resolution and may also complement existing experimental distance data for a protein structure determination. © 1996 John Wiley & Sons, Inc.

7.
Many proteins are composed of several domains that pack together into a complex tertiary structure. Multidomain proteins can be challenging for protein structure modeling, particularly those for which templates can be found for individual domains but not for the entire sequence. In such cases, homology modeling can generate high quality models of the domains but not of the orientations between domains. Small-angle X-ray scattering (SAXS) reports the structural properties of entire proteins and has the potential for guiding homology modeling of multidomain proteins. In this article, we describe a novel multidomain protein assembly modeling method, SAXSDom, that integrates experimental knowledge from SAXS with a probabilistic Input-Output Hidden Markov model to assemble the structures of individual domains together. Four SAXS-based scoring functions were developed and tested, and the method was evaluated on multidomain proteins from two public datasets. Incorporation of SAXS information improved the accuracy of domain assembly for 40 out of 46 Critical Assessment of protein Structure Prediction (CASP) multidomain protein targets and 45 out of 73 multidomain protein targets from the ab initio domain assembly dataset. The results demonstrate that SAXS data can provide useful information to improve the accuracy of domain-domain assembly. The source code and tool packages are available at https://github.com/jianlin-cheng/SAXSDom.

8.
Small angle X‐ray scattering (SAXS) is an experimental technique used for structural characterization of macromolecules in solution. Here, we introduce BCL::SAXS, an algorithm designed to replicate SAXS profiles from rigid protein models at different levels of detail. We first show our derivation of BCL::SAXS and compare our results with the experimental scattering profile of hen egg white lysozyme. Using this protein we show how to generate SAXS profiles representing: (1) complete models, (2) models with approximated side chain coordinates, and (3) models with approximated side chain and loop region coordinates. We evaluated the ability of SAXS profiles to identify a correct protein topology from a non‐redundant benchmark set of proteins. We find that complete SAXS profiles can be used to identify the correct protein by receiver operating characteristic (ROC) analysis with an area under the curve (AUC) > 99%. We show how our approximation of loop coordinates between secondary structure elements improves protein recognition by SAXS for protein models without loop regions and side chains. Agreement with SAXS data is a necessary but not sufficient condition for structure determination. We conclude that experimental SAXS data can be used as a filter to exclude protein models with large structural differences from the native. Proteins 2015; 83:1500–1512. © 2015 Wiley Periodicals, Inc.
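To make the idea of computing a scattering profile from a rigid model concrete, here is a minimal Python sketch of the standard Debye formula, I(q) = Σ_i Σ_j f_i f_j sin(q r_ij) / (q r_ij). The abstract does not state which formulation BCL::SAXS uses, so this is an assumption for illustration only; the function name is hypothetical, the form factors are treated as q-independent constants, and solvent contributions are ignored.

```python
import math

def debye_intensity(coords, form_factors, q):
    """Scattering intensity at momentum transfer q via the Debye formula.

    coords       : list of (x, y, z) positions, e.g. in Å.
    form_factors : one scattering factor per position (assumed constant in q).
    q            : momentum transfer in inverse units of the coordinates.
    """
    total = 0.0
    for i, (pi, fi) in enumerate(zip(coords, form_factors)):
        for j, (pj, fj) in enumerate(zip(coords, form_factors)):
            x = q * math.dist(pi, pj)
            # sin(x)/x tends to 1 as x -> 0 (self terms and q = 0).
            sinc = 1.0 if x == 0.0 else math.sin(x) / x
            total += fi * fj * sinc
    return total
```

Evaluating this function over a range of q values gives a profile that could be compared against an experimental curve; the double loop is quadratic in the number of scatterers, which is one reason coarse-grained approximations are attractive.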

9.
We attempted to predict through computer modeling the structure of the light-harvesting complex II (LH-II) of Rhodospirillum molischianum, before the impending publication of the structure of a homologous protein solved by means of X-ray diffraction. The protein studied is an integral membrane protein of 16 independent polypeptides, 8 alpha-apoproteins and 8 beta-apoproteins, which aggregate and bind to 24 bacteriochlorophyll-a's and 12 lycopenes. Available diffraction data of a crystal of the protein, which could not be phased due to a lack of heavy metal derivatives, served to test the predicted structure, guiding the search. In order to determine the secondary structure, hydropathy analysis was performed to identify the putative transmembrane segments and multiple sequence alignment propensity analyses were used to pinpoint the exact sites of the 20-residue-long transmembrane segment and the 4-residue-long terminal sequence at both ends, which were independently verified and improved by homology modeling. A consensus assignment for the secondary structure was derived from a combination of all the prediction methods used. Three-dimensional structures for the alpha- and the beta-apoprotein were built by comparative modeling. The resulting tertiary structures were combined, using X-PLOR, into an alpha beta dimer pair with bacteriochlorophyll-a's attached under constraints provided by site-directed mutagenesis and spectral data. The alpha beta dimer pairs were then aggregated into a quaternary structure through further molecular dynamics simulations and energy minimization. The structure of LH-II so determined is an octamer of alpha beta heterodimers forming a ring with a diameter of 70 Å.

10.
An automatic procedure for building a protein polyalanine backbone from Cα positions and 'spare parts' retrieved from a data base of 66 high-resolution protein structures is described. Protein backbones are constructed from overlapping fragments of variable length, which allows the backbone of regular secondary structure elements to be built in one block. The procedure is shown to yield backbones which compare very favourably with those from highly refined X-ray structures (r.m.s. deviation between generated and crystal structures less than 1 Å). The method is furthermore quite insensitive to experimental errors in Cα positions as well as to the size of the data base, and is seen to yield valuable insight into the relationships between sequence and 3-D structure: one example on triose phosphate isomerase, a beta-barrel protein, shows that beta alpha loops can be considered as structurally more uncommon than alpha beta loops. The 'spare parts' approach is also found to be useful for general-purpose modelling of local structural changes produced by insertion or deletion of residues. It should, however, be used with caution. Crude selection criteria based solely on fragment length and geometric fit to the loop base regions yield realistic backbones in about two-thirds of the test cases (r.m.s. deviations from refined crystal structure approximately 1 Å). In the remaining cases, sequence information, in particular the presence of glycine residues which tend to adopt more unusual backbone conformations, must be considered to obtain comparable results.

11.
Knowledge of structural class plays an important role in understanding protein folding patterns. It is therefore necessary to develop effective and reliable computational methods for prediction of protein structural class. To this end, we present a new method called NN-CDM, a nearest neighbor classifier with a complexity-based distance measure. Instead of extracting features from protein sequences as done previously, the distance between each pair of protein sequences is directly evaluated by a complexity measure of symbol sequences. Then the nearest neighbor classifier is adopted as the predictive engine. To verify the performance of this method, jackknife cross-validation tests are performed on several benchmark datasets. Results show that our approach achieves higher prediction accuracy than some classical methods.
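The general idea of a nearest neighbor classifier driven by a complexity-based distance can be sketched as follows. The abstract does not specify the complexity measure used by NN-CDM, so this example substitutes the normalized compression distance computed with Python's standard `zlib` module; that choice, the function names, and the class labels used in any call are assumptions made purely for illustration.

```python
import zlib

def compressed_size(seq):
    """Length in bytes of the zlib-compressed sequence (a crude complexity proxy)."""
    return len(zlib.compress(seq.encode("ascii"), 9))

def ncd(a, b):
    """Normalized compression distance between two sequences (stand-in measure)."""
    ca, cb = compressed_size(a), compressed_size(b)
    cab = compressed_size(a + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def predict_class(query, training):
    """Return the label of the training sequence closest to the query.

    training: list of (sequence, class_label) pairs (hypothetical toy input).
    """
    nearest = min(training, key=lambda item: ncd(query, item[0]))
    return nearest[1]
```

Because the distance is computed directly on sequence pairs, no feature extraction is needed; the trade-off is that every prediction compares the query against all training sequences.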

12.
13.
We outline a set of strategies to infer protein function from structure. The overall approach depends on extensive use of homology modeling, the exploitation of a wide range of global and local geometric relationships between protein structures and the use of machine learning techniques. The combination of modeling with broad searches of protein structure space defines a “structural BLAST” approach to infer function with high genomic coverage. Applications are described to the prediction of protein–protein and protein–ligand interactions. In the context of protein–protein interactions, our structure‐based prediction algorithm, PrePPI, has comparable accuracy to high‐throughput experiments. An essential feature of PrePPI involves the use of Bayesian methods to combine structure‐derived information with non‐structural evidence (e.g. co‐expression) to assign a likelihood for each predicted interaction. This, combined with a structural BLAST approach, significantly expands the range of applications of protein structure in the annotation of protein function, including systems level biological applications where it has previously played little role.

14.
A low-resolution scoring function for the selection of native and near-native structures from a set of predicted structures for a given protein sequence has been developed. The scoring function, ProVal (Protein Validate), used several variables that describe an aspect of protein structure for which the proximity to the native structure can be assessed quantitatively. Among the parameters included are a packing estimate, surface areas, and the contact order. A partial least squares for latent variables (PLS) model was built for each candidate set of the 28 decoy sets of structures generated for 22 different proteins using the described parameters as independent variables. The C(alpha) RMS of the candidate structures versus the experimental structure was used as the dependent variable. The final generalized scoring function was an average of all models derived, ensuring that the function was not optimized for specific fold classes or method of structure generation of the candidate folds. The results show that the crystal structure was scored best in 64% of the 28 test sets and was clearly separated from the decoys in many examples. In all the other cases in which the crystal structure did not rank first, it ranked within the top 10%. Thus, although ProVal could not distinguish between predicted structures that were similar overall in fold quality due to its inherently low resolution, it can clearly be used as a primary filter to eliminate approximately 90% of fold candidates generated by current prediction methods from all-atom modeling and further evaluation. The correlation between the predicted and actual C(alpha) RMS values varies considerably between the candidate fold sets.
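One of the descriptors named above, the contact order, is commonly defined as the average sequence separation of contacting residue pairs divided by the chain length. The Python sketch below follows that common definition; the abstract does not say how ProVal defines contacts or normalizes the value, so the function name and the input format (a precomputed list of contacting residue index pairs) are assumptions.

```python
def relative_contact_order(contacts, chain_length):
    """Relative contact order: mean |i - j| over contacts, divided by chain length.

    contacts     : list of (i, j) residue index pairs judged to be in contact
                   (how contacts are detected is left to the caller).
    chain_length : number of residues in the chain.
    """
    if not contacts or chain_length <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * chain_length)
```

Low values indicate mostly local contacts (as in helical proteins), whereas higher values indicate many contacts between residues far apart in sequence.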

15.
MOTIVATION: Remote homology detection is among the most intensively researched problems in bioinformatics. Currently, discriminative approaches, especially kernel-based methods, provide the most accurate results. However, kernel methods also show several drawbacks: in many cases prediction of new sequences is computationally expensive, kernels often lack an interpretable model for analysis of characteristic sequence features, and finally most approaches make use of so-called hyperparameters which complicate the application of methods across different datasets. RESULTS: We introduce a feature vector representation for protein sequences based on distances between short oligomers. The corresponding feature space arises from distance histograms for any possible pair of K-mers. Our distance-based approach shows important advantages in terms of computational speed while on common test data the prediction performance is highly competitive with state-of-the-art methods for protein remote homology detection. Furthermore, the learnt model can easily be analyzed in terms of discriminative features and, in contrast to other methods, our representation does not require any tuning of kernel hyperparameters. AVAILABILITY: Normalized kernel matrices for the experimental setup can be downloaded at www.gobics.de/thomas. Matlab code for computing the kernel matrices is available upon request. CONTACT: thomas@gobics.de, peter@gobics.de.
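A distance histogram over pairs of K-mers can be illustrated with a short Python sketch. The details here (counting ordered pairs, the maximum distance, and the function and parameter names) are assumptions chosen for illustration and are not taken from the authors' Matlab code.

```python
from collections import Counter

def kmer_distance_features(seq, k=1, max_dist=3):
    """Count how often K-mer b starts d positions after K-mer a.

    Returns a Counter keyed by (kmer_a, kmer_b, d) for 1 <= d <= max_dist.
    Each key corresponds to one bin of one pairwise distance histogram.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter()
    for i, a in enumerate(kmers):
        for d in range(1, max_dist + 1):
            if i + d < len(kmers):
                counts[(a, kmers[i + d], d)] += 1
    return counts
```

Flattening these counts over a fixed ordering of all possible keys yields a fixed-length feature vector, which is what allows a linear model's weights to be read back as discriminative K-mer pairs and distances.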

16.
The Cα backbones of the glucose isomerase molecules of Streptomyces rubiginosus and Arthrobacter have been determined by X-ray crystallography and compared. Each molecule is a tetramer of eight-stranded alpha/beta barrels, and the mode of association of the tetramers is identical in each case. The Arthrobacter electron density shows four additional amino acids at the carboxyl terminus. There is also an insertion of six amino acids at position 277, and two individual insertions at about positions 348 and 357 (numbering according to the Streptomyces structure). There is a close structural homology throughout the whole molecule, which is most accurate up to position 325. The r.m.s. displacement for 315 homologous Cα positions up to this position is 0.92 Å.
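The r.m.s. displacement quoted above is the root-mean-square distance between matched atom positions. A minimal Python sketch follows; it assumes the two coordinate lists are already superposed and paired residue by residue (the superposition step itself is not shown), and the function name is hypothetical.

```python
import math

def rms_displacement(coords_a, coords_b):
    """Root-mean-square distance between paired, pre-superposed positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))
```

Because the result depends on which positions are paired, comparisons such as the one in this abstract report the number of homologous positions (here 315) alongside the value.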

17.
Macromolecular assemblies play an important role in all cellular processes. While there has recently been significant progress in protein structure prediction based on deep learning, large protein complexes cannot be predicted with these approaches. The integrative structure modeling approach characterizes multi-subunit complexes by computational integration of data from fast and accessible experimental techniques. Crosslinking mass spectrometry is one such technique that provides spatial information about the proximity of crosslinked residues. One of the challenges in interpreting crosslinking datasets is designing a scoring function that, given a structure, can quantify how well it fits the data. Most approaches set an upper bound on the distance between Cα atoms of crosslinked residues and calculate a fraction of satisfied crosslinks. However, the distance spanned by the crosslinker greatly depends on the neighborhood of the crosslinked residues. Here, we design a deep learning model for predicting the optimal distance range for a crosslinked residue pair based on the structures of their neighborhoods. We find that our model can predict the distance range with the area under the receiver-operator curve of 0.86 and 0.7 for intra- and inter-protein crosslinks, respectively. Our deep scoring function can be used in a range of structure modeling applications.
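The conventional baseline score mentioned in this abstract, the fraction of crosslinks whose Cα–Cα distance falls under a fixed upper bound, can be sketched in a few lines of Python. The function name, the input format, and the default bound of 30 Å are assumptions for illustration; the abstract does not give a specific cutoff.

```python
import math

def fraction_satisfied(ca_coords, crosslinks, max_dist=30.0):
    """Fraction of crosslinks with Cα-Cα distance at or below max_dist.

    ca_coords : dict mapping a residue identifier to its Cα (x, y, z) position.
    crosslinks: list of (residue_i, residue_j) identifier pairs.
    max_dist  : upper bound in the coordinate units (placeholder value).
    """
    if not crosslinks:
        return 0.0
    satisfied = sum(
        1 for i, j in crosslinks
        if math.dist(ca_coords[i], ca_coords[j]) <= max_dist
    )
    return satisfied / len(crosslinks)
```

A single global cutoff treats every residue pair the same way, which is the limitation the abstract's neighborhood-dependent distance range is meant to address.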

18.
Small-angle X-ray scattering (SAXS) is a powerful method for obtaining quantitative structural information on the size and shape of proteins, and it is increasingly used in kinetic studies of folding and association reactions. In this minireview, we discuss recent developments in using SAXS to obtain structural information on the unfolded ensemble and early folding intermediates of proteins using continuous-flow mixing devices. Interfacing of these micromachined devices to SAXS beamlines has allowed access to the microsecond time regime. The experimental constraints in implementation of turbulence and laminar flow-based mixers with SAXS detection and a comparison of the two approaches are presented. Current improvements and future prospects of microsecond time-resolved SAXS and the synergy with ab initio structure prediction and molecular dynamics simulations are discussed.

19.
The ASC2 structure has been well defined by 1141 experimental NOE restraints. The model consists of five alpha helices. The alpha-helices are connected by short random-structure loops. The sole exception is the loop connecting helices 2 and 3, which is 20 residues long. The fold generally agrees with that of recently published death domain structures in which alpha-helix structures have been reported. In spite of the structural similarity, amino acid sequence homology with the most similar protein (ASC1) is just 64%. DD, DED, and CASP protein structures present six helices along their sequences; ASC2 presents 5 well-defined helices due to long distance restraints. However, a helical fragment was observed between amino acids 38 and 42 (representing helix 3) in the death domains when constructing the model.

20.
Chen J, Brooks CL. Proteins 2007, 67(4): 922-930
Recent advances in efficient and accurate treatment of solvent with the generalized Born approximation (GB) have made it possible to substantially refine the protein structures generated by various prediction tools through detailed molecular dynamics simulations. As demonstrated in a recent CASPR experiment, improvement can be quite reliably achieved when the initial models are sufficiently close to the native basin (e.g., 3-4 Å C(alpha) RMSD). A key element of effective refinement is to incorporate reliable structural information into the simulation protocol. Without intimate knowledge of the target and prediction protocol used to generate the initial structural models, it can be assumed that the regular secondary structure elements (helices and strands) and overall fold topology are largely correct to start with, such that the protocol limits itself to the scope of refinement and focuses the sampling in the vicinity of the initial structure. The secondary structures can be enforced by dihedral restraints and the topology through structural contacts, implemented as either multiple pair-wise C(alpha) distance restraints or a single sidechain distance matrix restraint. The restraints are weakly imposed with flat-bottom potentials to allow sufficient flexibility for structural rearrangement. Refinement is further facilitated by enhanced sampling with advanced techniques such as the replica exchange method (REX). In general, for single domain proteins of small to medium sizes, 3-5 nanoseconds of REX/GB refinement simulations appear to be sufficient for reasonable convergence. Clustering of the resulting structural ensembles can yield refined models over 1.0 Å closer to the native structure in C(alpha) RMSD. Substantial improvement of sidechain contacts and rotamer states can also be achieved in most cases. Additional improvement is possible with longer sampling and knowledge of the robust structural features in the initial models for a given prediction protocol.
Nevertheless, limitations still exist in sampling as well as force field accuracy, manifested as difficulty in refinement of long and flexible loops.
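A flat-bottom distance restraint of the kind mentioned in this abstract applies no penalty inside an allowed range and a rising penalty outside it. The Python sketch below uses a harmonic penalty beyond the bounds; the functional form outside the flat region, the force constant convention (k rather than k/2), and the function name are assumptions, since the abstract does not give the exact potential.

```python
def flat_bottom_energy(d, lower, upper, k=1.0):
    """Restraint energy for a distance d with a penalty-free range [lower, upper].

    Inside the range the energy is zero, so the structure can rearrange freely;
    outside it the energy grows quadratically with the violation.
    """
    if d < lower:
        return k * (lower - d) ** 2
    if d > upper:
        return k * (d - upper) ** 2
    return 0.0
```

Choosing a wide flat region with a small force constant corresponds to the "weakly imposed" restraints described above: the initial topology is preserved without pinning individual distances to their starting values.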
