Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
PredAcc: prediction of solvent accessibility   Total citations: 2 (self-citations: 0, citations by others: 2)
SUMMARY: PredAcc is a tool for predicting the solvent accessibility of protein residues from the sequence at different relative accessibility levels (0-55%). The prediction rate varies between 70.7% (for 25% relative accessibility) and 85.7% (for 0% relative accessibility). Amino acids are predicted in four categories: almost certainly hidden and almost certainly exposed with a given a posteriori prediction error, probably hidden and probably exposed otherwise. AVAILABILITY: http://condor.urbb.jussieu.fr/PredAccCfg.html CONTACT: tuffery@urbb.jussieu.fr

2.
In this work, we explore a novel method to broaden the scope of sequence-based predictions of solvent accessibility or accessible surface area (ASA) to the atomic level. All 167 heavy atoms from the 20 types of amino acid residues in proteins have been studied. An analysis of ASA distribution of these atomic groups in different proteins has been performed and rotamer-style libraries have been developed. We observe that the ASA of some atomic groups (e.g., backbone C and N atoms) can be estimated from the sequence environment within a mean absolute error of 2-3 Å². However, some side chain atoms such as CG in Pro, NH1 in Arg and NE2 in Gln show a strong variability, making it more difficult to estimate their ASA from sequence environment. In general, the prediction of ASA becomes more difficult for atomic positions at the side chain extremities of long amino acid residues (aromatic side chain terminals being the exception). Several atomic groups are frequently exposed to solvent. Some of them have a bimodal distribution, suggesting two stable conformations in terms of their solvent exposure. More detailed understanding and prediction of solvent accessibility, i.e., at an atomic level, is expected to help in bioinformatics approaches to structure prediction, functional relevance of atomic solvent accessibilities and other interaction analyses.

3.
A simple method of predicting residue solvent accessibilities in proteins is described, with the intention that it should be used as a baseline by which more sophisticated approaches to prediction can be judged. Comparison with existing methods of predicting residue burial reveals that their performance is often little better than that of the baseline method. The problem of comparing different prediction methods is shown to be complicated by the proliferation of different schemes for classifying residue burial.

4.
Kaleel Manaz, Torrisi Mirko, Mooney Catherine, Pollastri Gianluca. Amino Acids, 2019, 51(9): 1289-1296

Predicting the three-dimensional structure of proteins is a long-standing challenge of computational biology, as the structure (or lack of a rigid structure) is well known to determine a protein’s function. Predicting relative solvent accessibility (RSA) of amino acids within a protein is a significant step towards resolving the protein structure prediction challenge, especially in cases in which structural information about a protein is not available by homology transfer. Today, arguably the core of the most powerful prediction methods for predicting RSA and other structural features of proteins is some form of deep learning, and all the state-of-the-art protein structure prediction tools rely on some machine learning algorithm. In this article we present a deep neural network architecture composed of stacks of bidirectional recurrent neural networks and convolutional layers which is capable of mining information from long-range interactions within a protein sequence and applying it to the prediction of protein RSA using a novel encoding method that we shall call “clipped”. The final system we present, PaleAle 5.0, which is available as a public server, predicts RSA into two, three and four classes at an accuracy exceeding 80% in two classes, surpassing the performances of all the other predictors we have benchmarked.


5.
NETASA: neural network based prediction of solvent accessibility   Total citations: 3 (self-citations: 0, citations by others: 3)
MOTIVATION: Prediction of the tertiary structure of a protein from its amino acid sequence is one of the most important problems in molecular biology. The successful prediction of solvent accessibility will be very helpful to achieve this goal. In the present work, we have implemented a server, NETASA, for predicting solvent accessibility of amino acids using our newly optimized neural network algorithm. Several new features in the neural network architecture and training method have been introduced, and the network learns faster to provide accuracy values which are comparable to or better than those of other methods of ASA prediction. RESULTS: Predictions in two- and three-state classification systems with several thresholds are provided. Our prediction method achieved an accuracy level of up to 90% for training and 88% for test data sets. Three-state prediction results provide a maximum of 65% accuracy for training and 63% for the test data. Applicability of neural networks for ASA prediction has been confirmed with a larger data set and wider range of state thresholds. Salient differences between a linear and exponential network for ASA prediction have been analysed. AVAILABILITY: Online predictions are freely available at: http://www.netasa.org. Linux ix86 binaries of the program written for this work may be obtained by email from the corresponding author.

6.
Li X, Pan XM. Proteins, 2001, 42(1): 1-5
A novel method was developed for predicting solvent accessibility. Based on single sequence data, this method achieved 71.5% accuracy with a correlation coefficient of 0.42 in a database of 704 proteins, with a threshold of 20% defining the two solvent accessibility states. Prediction in a data subset of 341 monomeric proteins achieved 72.7% accuracy with a correlation coefficient of 0.43. On average, prediction over short chains gives better results than that over long chains. With a solvent accessibility threshold of 20%, prediction over 236 monomeric proteins with chain length < 300 amino acid residues achieved 75.3% accuracy with a correlation coefficient of 0.44 by jackknife analysis, which is higher than that obtained by previous methods using multiple sequence alignments.
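The two figures quoted in this abstract, accuracy and correlation coefficient, can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient was used; the Matthews correlation coefficient, a common choice for two-state predictions, is assumed here. The function names and the "E"/"B" (exposed/buried) labels are hypothetical.

```python
import math
from typing import Sequence, Tuple


def confusion_counts(predicted: Sequence[str], observed: Sequence[str],
                     positive: str = "E") -> Tuple[int, int, int, int]:
    """Count (tp, fp, tn, fn), treating `positive` (exposed) as the positive class."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    tp = fp = tn = fn = 0
    for p, o in zip(predicted, observed):
        if p == positive and o == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif o == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of residues whose two-state label was predicted correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def matthews_cc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient (assumed metric); 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

Reporting both numbers is useful because accuracy alone can look high when one state dominates the data set, whereas a correlation coefficient penalizes a predictor that simply favors the majority state.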

7.

Background  

It is well known that most of the binding free energy of protein interaction is contributed by a few key hot spot residues. These residues are crucial for understanding the function of proteins and studying their interactions. Experimental hot spot detection methods such as alanine scanning mutagenesis are not applicable on a large scale since they are time-consuming and expensive. Therefore, reliable and efficient computational methods for identifying hot spots are greatly desired and urgently required.

8.

Background  

Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio case but also when homology information to known structures is available. Structural properties are also routinely used in protein analysis even when homology is available, largely because homology modelling is lower throughput than, say, secondary structure prediction. Nonetheless, predictors of secondary structure and solvent accessibility are virtually always ab initio.

9.
Prediction of protein structure from its amino acid sequence is still a challenging problem. A complete physicochemical understanding of protein folding is essential for accurate structure prediction. Knowledge of residue solvent accessibility gives useful insights into protein structure prediction and function prediction. In this work, we propose a random forest method, RSARF, to predict residue accessible surface area from protein sequence information. The training and testing were performed using 120 proteins containing 22006 residues. For each residue, the buried or exposed state was computed using five thresholds (0%, 5%, 10%, 25%, and 50%). The prediction accuracies for the 0%, 5%, 10%, 25%, and 50% thresholds are 72.9%, 78.25%, 78.12%, 77.57% and 72.07%, respectively. Further, comparison of RSARF with other methods using a benchmark dataset containing 20 proteins shows that our approach is useful for prediction of residue solvent accessibility from protein sequence without using structural information. The RSARF program, datasets and supplementary data are available at http://caps.ncbs.res.in/download/pugal/RSARF/.
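The thresholding step mentioned in this abstract can be sketched as follows. This is an illustration only, not the RSARF code: the function name is hypothetical, and the convention that a residue counts as exposed when its relative solvent accessibility is strictly greater than the threshold is an assumption (the abstract does not state how ties are handled).

```python
from typing import Dict, List, Sequence

# Thresholds (in percent) listed in the abstract.
THRESHOLDS = (0.0, 5.0, 10.0, 25.0, 50.0)


def two_state_labels(rsa_percent: Sequence[float],
                     thresholds: Sequence[float] = THRESHOLDS) -> Dict[float, List[str]]:
    """Map each threshold to per-residue labels: 'E' (exposed) or 'B' (buried).

    Assumed convention: exposed if RSA > threshold, buried otherwise.
    """
    return {
        t: ["E" if rsa > t else "B" for rsa in rsa_percent]
        for t in thresholds
    }
```

For example, a residue with 12% relative accessibility would be labeled exposed at the 0%, 5% and 10% thresholds and buried at 25% and 50%, which is why accuracies reported at different thresholds are not directly comparable.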

10.
The human plasma protein transthyretin (TTR) may form fibrillar protein deposits that are associated with both inherited and idiopathic amyloidosis. The present study utilizes solution nuclear magnetic resonance spectroscopy, in combination with hydrogen/deuterium exchange, to determine residue-specific solvent protection factors within the fibrillar structure of the clinically relevant variant, TTRY114C. This novel approach suggests a fibril core comprised of the six beta-strands, A-B-E-F-G-H, which retains a native-like conformation. Strands C and D are dislocated from their native edge region and become solvent-exposed, leaving a new interface involving strands A and B open for intermolecular interactions. Our results further support a native-like intermolecular association between strands F-F' and H-H' with a prolongation of these beta-strands and, interestingly, with a possible shift in beta-strand register of the subunit assembly. This finding may explain previous observations of a monomeric intermediate preceding fibril formation. A structural model based on our results is presented.

11.
Acrylamide quenching is widely used to monitor the solvent exposure of fluorescent probes in vitro. Here, we tested the utility of this technique to discriminate local RNA secondary structures using the fluorescent adenine analogue 2-aminopurine (2-AP). Under native conditions, the solvent accessibilities of most 2-AP-labeled RNA substrates were poorly resolved by classical single-population models; rather, a two-state quencher accessibility algorithm was required to model acrylamide-dependent changes in 2-AP fluorescence in structured RNA contexts. Comparing 2-AP quenching parameters between structured and unstructured RNA substrates permitted the effects of local RNA structure on 2-AP solvent exposure to be distinguished from nearest neighbor effects or environmental influences on intrinsic 2-AP photophysics. Using this strategy, the fractional accessibility of 2-AP for acrylamide (f_a) was found to be highly sensitive to local RNA structure. Base-paired 2-AP exhibited relatively poor accessibility, consistent with extensive shielding by adjacent bases. 2-AP in a single-base bulge was uniformly accessible to solvent, whereas the fractional accessibility of 2-AP in a hexanucleotide loop was indistinguishable from that of an unstructured RNA. However, these studies also provided evidence that the f_a parameter reflects local conformational dynamics in base-paired RNA. Enhanced base pair dynamics at elevated temperatures were accompanied by increased f_a values, while restricting local RNA breathing by adding a C-G base pair clamp or positioning 2-AP within extended RNA duplexes significantly decreased this parameter. Together, these studies show that 2-AP quenching studies can reveal local RNA structural and dynamic features beyond those that can be measured by conventional spectroscopic approaches.
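A two-population quenching model of the kind this abstract alludes to can be sketched as follows. The abstract does not give the equation; the form below is the standard modified Stern-Volmer (Lehrer) treatment, assumed here for illustration, in which a fraction f_a of fluorophores is quenched with constant K_SV and the remainder is inaccessible. Function and parameter names are hypothetical.

```python
from typing import Sequence, Tuple


def fraction_remaining(q: float, f_a: float, k_sv: float) -> float:
    """F/F0 at quencher concentration q for a two-population model.

    The accessible fraction f_a follows Stern-Volmer quenching;
    the inaccessible fraction (1 - f_a) is unaffected by the quencher.
    """
    return (1.0 - f_a) + f_a / (1.0 + k_sv * q)


def fit_lehrer(quencher: Sequence[float],
               f_over_f0: Sequence[float]) -> Tuple[float, float]:
    """Estimate (f_a, K_SV) from the linearized form
    F0 / (F0 - F) = 1 / (f_a * K_SV * [Q]) + 1 / f_a
    by ordinary least squares on y = F0/(F0 - F) versus x = 1/[Q].
    Requires every concentration > 0 and every F/F0 < 1.
    """
    xs = [1.0 / q for q in quencher]
    ys = [1.0 / (1.0 - r) for r in f_over_f0]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # equals 1 / (f_a * K_SV)
    intercept = mean_y - slope * mean_x  # equals 1 / f_a
    return 1.0 / intercept, intercept / slope
```

In this picture a single-population model is the special case f_a = 1, so an intercept noticeably above 1 in the linearized plot is what signals a shielded subpopulation.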

12.
One of the most successful drug targets against AIDS in the last decade has been the HIV-1 protease (HIV-1 PR), an enzyme that processes the polyprotein gene products into active replicative viral proteins. In our quest for a wide-ranging binding free energy function, we have extended the solvent accessibility free energy predictor (SAFE_p) method, recently developed for peptidic HIV-1 PR inhibitors, to the study of the binding of cyclic urea (CU) HIV-1 PR inhibitors. Our results show that there is a need for a specific term depicting polar contacts to be added to the original SAFE_p analytical expression, an outcome not seen in our studies of HIV-1 PR peptidic inhibitors. Nevertheless, despite the higher profile of the electrostatic interactions in the binding of the CU inhibitors, our analysis indicates that CU inhibitor binding is still driven by the hydrophobic entropic contribution, as much as for the peptidic inhibitors.

13.
Considering accessibility of the 3′UTR is believed to increase the precision of microRNA target predictions. We show that, contrary to common belief, ranking by the hybridization energy or by the sum of the opening and hybridization energies, used in currently available algorithms, is not an efficient way to rank predictions. Instead, we describe an algorithm which also considers only the accessible binding sites but which ranks predictions according to over-representation. When compared with experimentally validated and refuted targets in the fruit fly and human, our algorithm shows a remarkable improvement in precision while significantly reducing the computational cost in comparison with other free energy based methods. In the human genome, our algorithm has at least twice the precision of other methods with their default parameters. In the fruit fly, we find five times more validated targets among the top 500 predictions than other methods with their default parameters. Furthermore, using a common statistical framework we demonstrate explicitly the advantages of using the canonical ensemble instead of using the minimum free energy structure alone. We also find that ‘naïve’ global folding sometimes outperforms the local folding approach.

14.
15.
The solvent accessibility of each residue is predicted on the basis of the protein sequence. A set of 338 monomeric, non-homologous and high-resolution protein crystal structures is used as a learning set and a jackknife procedure is applied to each entry. The prediction is based on the comparison of the observed and the average values of the solvent-accessible area. It appears that the prediction accuracy is significantly improved by considering the residue types preceding and/or following the residue whose accessibility must be predicted. In contrast, the separate treatment of different secondary structural types does not improve the quality of the prediction. It is furthermore shown that the residue accessibility is much better predicted in small than in larger proteins. Such a discrepancy must be carefully considered in any algorithm for predicting residue accessibility.

16.
We present an approach for incorporating solvent accessibility data from electron paramagnetic resonance experiments in the structural refinement of membrane proteins through restrained molecular dynamics simulations. The restraints have been parameterized from oxygen (ΠO2) and nickel-ethylenediaminediacetic acid (ΠNiEdda) collision frequencies, as indicators of lipid or aqueous exposed spin-label sites. These are enforced through interactions between a pseudoatom representation of the covalently attached nitroxide spin-label and virtual “solvent” particles corresponding to O2 and NiEdda in the surrounding environment. Interactions were computed using an empirical potential function, where the parameters have been optimized to account for the different accessibilities of the spin-label pseudoatoms to the surrounding environment. This approach, “pseudoatom-driven solvent accessibility refinement”, was validated by refolding distorted conformations of the Streptomyces lividans potassium channel (KcsA), corresponding to a range of 2-30 Å root mean-square deviations away from the native structure. Molecular dynamics simulations based on up to 58 electron paramagnetic resonance restraints derived from spin-label mutants were able to converge toward the native structure within 1-3 Å root mean-square deviations with minimal computational cost. The use of energy-based ranking and structure similarity clustering as selection criteria helped in the convergence and identification of correctly folded structures from a large number of simulations. This approach can be applied to a variety of integral membrane protein systems, regardless of oligomeric state, and should be particularly useful in calculating conformational changes from a known reference crystal structure.

17.
18.
Myoglobin structure and regulation of solvent accessibility of heme pocket   Total citations: 1 (self-citations: 0, citations by others: 1)
The effects of heme removal on the molecular structure of tuna and sperm whale myoglobin have been investigated by comparing the solvent accessibility to the heme pocket of the two proteins with that of the corresponding apoproteins. Although the heme microenvironment of tuna myoglobin is more polar than that of sperm whale myoglobin, the accessibility of solvent to heme is identical in the two proteins as revealed by thermal perturbation of Soret absorption. The removal of heme produces loss of helical folding and increase of solvent accessibility but the effects are rather different for the two proteins. More precisely, the loss of helical structure upon heme removal is 50% for tuna myoglobin and 15% for sperm whale myoglobin; moreover, the solvent accessibility of the heme pocket of tuna apomyoglobin is 2-3-fold greater than that of sperm whale apomyoglobin. These results have been explained in terms of the lack of helical folding in segment D, the structural organization of which may have a relevant effect in regulating the accessibility of ligands to the heme. The effects produced by charged quenchers reveal that the ligand path from the surface of the molecule to the iron atom of the heme involves a positively charged residue which may reasonably be identified as Arg-45 (sperm whale myoglobin) or Lys-41 (tuna myoglobin) on the basis of recent X-ray crystallographic information.

19.
ABSTRACT: BACKGROUND: Protein structure mediates site-specific patterns of sequence divergence. In particular, residues in the core of a protein (solvent-inaccessible residues) tend to be more evolutionarily conserved than residues on the surface (solvent-accessible residues). RESULTS: Here, we present a model of sequence evolution that explicitly accounts for the relative solvent accessibility of each residue in a protein. Our model is a variant of the Goldman-Yang 1994 (GY94) model in which all model parameters can be functions of the relative solvent accessibility (RSA) of a residue. We apply this model to a data set comprised of nearly 600 yeast genes, and find that an evolutionary-rate ratio omega that varies linearly with RSA provides a better model fit than an RSA-independent omega or an omega that is estimated separately in individual RSA bins. We further show that the branch length t and the transition-transversion ratio kappa also vary with RSA. The RSA-dependent GY94 model performs better than an RSA-dependent Muse-Gaut 1994 (MG94) model in which the synonymous and non-synonymous rates individually are linear functions of RSA. Finally, protein core size affects the slope of the linear relationship between omega and RSA, and gene expression level affects both the intercept and the slope. CONCLUSIONS: Structure-aware models of sequence evolution provide a significantly better fit than traditional models that neglect structure. The linear relationship between omega and RSA implies that genes are better characterized by their omega slope and intercept than by just their mean omega.
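The linear dependence of omega on RSA described in this abstract can be written compactly as below. The symbols for the intercept and slope are notation assumed here for illustration; the abstract itself names only omega and RSA.

```latex
\omega(\mathrm{RSA}) = \omega_0 + \omega_1 \,\mathrm{RSA}
```

Under this form a gene is summarized by the pair of intercept and slope rather than by a single mean omega, which is the characterization the abstract's conclusion recommends.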

20.
In this study, we propose a novel method to predict the solvent accessible surface areas of transmembrane residues. For both transmembrane alpha-helix and beta-barrel residues, the correlation coefficients between the predicted and observed accessible surface areas are around 0.65. On the basis of predicted accessible surface areas, residues exposed to the lipid environment or buried inside a protein can be identified by using certain cutoff thresholds. We have extensively examined our approach based on different definitions of accessible surface areas and a variety of sets of control parameters. Given that experimentally determining the structures of membrane proteins is very difficult and membrane proteins are actually abundant in nature, our approach is useful for theoretically modeling membrane protein tertiary structures, particularly for modeling the assembly of transmembrane domains. This approach can be used to annotate the membrane proteins in proteomes to provide extra structural and functional information.
