Similar articles
20 similar articles found (search time: 78 ms)
1.
Xu Z, Zhang C, Liu S, Zhou Y. Proteins, 2006, 63(4):961-966
Solvent accessibility, one of the key properties of amino acid residues in proteins, can be used to assist protein structure prediction. Various approaches such as neural networks, support vector machines, probability profiles, information theory, Bayesian theory, logistic functions, and multiple linear regression have been developed for solvent accessibility prediction. In this article, a much simpler quadratic programming method based on the buriability parameter set of amino acid residues is developed. The new method, called QBES (Quadratic programming and Buriability Energy function for Solvent accessibility prediction), is reasonably accurate for predicting the real value of solvent accessibility. By using a dataset of 30 proteins to optimize three parameters, the average correlation coefficients between the predicted and actual solvent accessibility are about 0.5 for all four independent test sets ranging from 126 to 513 proteins. The method is efficient. It takes only 20 min for a regular PC to obtain results for 30 proteins with an average length of 263 amino acids. Although the proposed method is less accurate than a few more sophisticated methods based on neural networks or support vector machines, this is the first attempt to predict solvent accessibility by energy optimization with constraints. Possible improvements and other applications of the method are discussed.

2.
The goal of this work is to characterize structurally ambivalent fragments in proteins. We have searched the Protein Data Bank and identified all structurally ambivalent peptides (SAPs) of length five or greater that exist in two different backbone conformations. The SAPs were classified in five distinct categories based on their structure. We propose a novel index that provides a quantitative measure of conformational variability of a sequence fragment. It measures the context-dependent width of the distribution of (phi,psi) dihedral angles associated with each amino acid type. This index was used to analyze the local structural propensity of both SAPs and the sequence fragments contiguous to them. We also analyzed type-specific amino acid composition, solvent accessibility, and overall structural properties of SAPs and their sequence context. We show that each type of SAP has an unusual, type-specific amino acid composition and, as a result, simultaneous intrinsic preferences for two distinct types of backbone conformation. All types of SAPs have lower sequence complexity than average. Fragments that adopt helical conformation in one protein and sheet conformation in another have the lowest sequence complexity and are sampled from a relatively limited repertoire of possible residue combinations. A statistically significant difference between two distinct conformations of the same SAP is observed not only in the overall structural properties of proteins harboring the SAP but also in the properties of its flanking regions and in the pattern of solvent accessibility. These results have implications for protein design and structure prediction.

3.
In this work, we explore a novel method to broaden the scope of sequence-based predictions of solvent accessibility or accessible surface area (ASA) to the atomic level. All 167 heavy atoms from the 20 types of amino acid residues in proteins have been studied. An analysis of ASA distribution of these atomic groups in different proteins has been performed and rotamer-style libraries have been developed. We observe that the ASA of some atomic groups (e.g., backbone C and N atoms) can be estimated from the sequence environment within a mean absolute error of 2-3 Å². However, some side chain atoms such as CG in Pro, NH1 in Arg and NE2 in Gln show a strong variability, making it more difficult to estimate their ASA from sequence environment. In general, the prediction of ASA becomes more difficult for atomic positions at the side chain extremities of long amino acid residues (aromatic side chain terminals being the exception). Several atomic groups are frequently exposed to solvent. Some of them have a bimodal distribution, suggesting two stable conformations in terms of their solvent exposure. More detailed understanding and prediction of solvent accessibility, i.e., at an atomic level, is expected to help in bioinformatics approaches to structure prediction, functional relevance of atomic solvent accessibilities and other interaction analyses.

4.
5.
Beta-breakers: an aperiodic secondary structure   (Cited by: 1; self-citations: 0; citations by others: 1)
We have studied the architecture of parallel beta-sheets in proteins and focused on the residues that initiate and terminate the beta-strands. These beta-breaker residues are at the origin of the kink between the beta-strand and the turn that precedes or follows it. Beta-breakers can be located automatically using a consensus approach based on algorithmic secondary structure assignment, solvent accessibility and backbone dihedral angles. These beta-breakers are conformationally homogeneous with respect to side-chain solvent accessibility and backbone dihedral angle profile. A sequence-structure correlation is noted: a restricted subset of amino acids is observed at these positions. Analysis of homologous protein sequences shows that these residues are more highly conserved than other residues in the loop. We conclude that beta-breakers are the structural analogs of the N- and C-terminal caps of alpha-helices. The identification of this aperiodic substructure suggests a strategy for improving secondary structure prediction and may guide site-directed mutagenesis experiments.

6.
Characterization of the C-terminal ER membrane anchor of PTP1B   (Cited by: 1; self-citations: 0; citations by others: 1)
The tyrosine phosphatase PTP1B is an important regulator of cell function. In living cells PTP1B activity is restricted to the vicinity of the endoplasmic reticulum (ER) by post-translational C-terminal attachment of PTP1B to the ER membrane network. In our study we investigated the membrane anchor of PTP1B by use of EGFP fusion proteins. We demonstrate that the membrane anchor of PTP1B cannot be narrowed down to a unique amino acid sequence with a defined start and stop point but rather is moveable within several amino acids. Removal of up to seven amino acids from the C-terminus, as well as exchange of single amino acids in the putative transmembrane sequence, did not influence subcellular localization of PTP1B. With the method of bimolecular fluorescence complementation we could demonstrate dimerization of PTP1B in vivo. Homodimerization was, in contrast to other tail-anchored proteins, not dependent on the membrane anchor. Our data demonstrate that the C-terminal membrane anchor of PTP1B is formed by a combination of a single stretch transmembrane domain (TMD) followed by a tail. TMD and tail length are variable and there are no sequence-specific features. Our data for PTP1B are consistent with a concept that explains the ER membrane anchor of tail-anchored proteins as a physicochemical structure.

7.
Pei J, Grishin NV. Proteins, 2004, 56(4):782-794
We study the effects of various factors in representing and combining evolutionary and structural information for local protein structural prediction based on fragment selection. We prepare databases of fragments from a set of non-redundant protein domains. For each fragment, evolutionary information is derived from homologous sequences and represented as estimated effective counts and frequencies of amino acids (evolutionary frequencies) at each position. Position-specific amino acid preferences called structural frequencies are derived from statistical analysis of discrete local structural environments in database structures. Our method for local structure prediction is based on ranking and selecting database fragments that are most similar to a target fragment. Using secondary structure type as a local structural property, we test our method in a number of settings. The major findings are: (1) the COMPASS-type scoring function for fragment similarity comparison gives better prediction accuracy than three other tested scoring functions for profile-profile comparison. We show that the COMPASS-type scoring function can be derived both in the probabilistic framework and in the framework of statistical potentials. (2) Using the evolutionary frequencies of database fragments gives better prediction accuracy than using structural frequencies. (3) Finer definition of local environments, such as including more side-chain solvent accessibility classes and considering the backbone conformations of neighboring residues, gives increasingly better prediction accuracy using structural frequencies. (4) Combining evolutionary and structural frequencies of database fragments, either in a linear fashion or using a pseudocount mixture formula, results in improvement of prediction accuracy. Combination at the log-odds score level is not as effective as combination at the frequency level. This suggests that there might be better ways of combining sequence and structural information than the commonly used linear combination of log-odds scores. Our method of fragment selection and frequency combination gives reasonable results of secondary structure prediction tested on 56 CASP5 targets (average SOV score 0.77), suggesting that it is a valid method for local protein structure prediction. A mixture of predicted structural frequencies and evolutionary frequencies improves the quality of local profile-to-profile alignment by COMPASS.

8.
We have developed an all-atom free-energy force field (PFF01) for protein tertiary structure prediction. PFF01 is based on physical interactions and was parameterized using experimental structures of a family of proteins believed to span a wide variety of possible folds. It contains empirical, although sequence-independent, terms for hydrogen bonding. Its solvent-accessible surface area solvent model was first fit to transfer energies of small peptides. The parameters of the solvent model were then further optimized to stabilize the native structure of a single protein, the autonomously folding villin headpiece, against competing low-energy decoys. Here we validate the force field for five nonhomologous helical proteins with 20-60 amino acids. For each protein, decoys with 2-3 Å backbone root mean-square deviation and correct experimental Cβ-Cβ distance constraints emerge as those with the lowest energy.

9.
Kaleel M, Torrisi M, Mooney C, Pollastri G. Amino Acids, 2019, 51(9):1289-1296

Predicting the three-dimensional structure of proteins is a long-standing challenge of computational biology, as the structure (or lack of a rigid structure) is well known to determine a protein’s function. Predicting relative solvent accessibility (RSA) of amino acids within a protein is a significant step towards resolving the protein structure prediction challenge, especially in cases in which structural information about a protein is not available by homology transfer. Today, arguably the core of the most powerful methods for predicting RSA and other structural features of proteins is some form of deep learning, and all the state-of-the-art protein structure prediction tools rely on some machine learning algorithm. In this article we present a deep neural network architecture composed of stacks of bidirectional recurrent neural networks and convolutional layers which is capable of mining information from long-range interactions within a protein sequence and applying it to the prediction of protein RSA using a novel encoding method that we shall call “clipped”. The final system we present, PaleAle 5.0, which is available as a public server, predicts RSA into two, three and four classes at an accuracy exceeding 80% in two classes, surpassing the performance of all the other predictors we have benchmarked.


10.
11.
Prediction of protein surface accessibility with information theory   (Cited by: 8; self-citations: 0; citations by others: 8)
A new, simple method based on information theory is introduced to predict the solvent accessibility of amino acid residues in various states defined by their different thresholds. Prediction is achieved by the application of information obtained from a single amino acid position or pair-information for a window of seventeen amino acids around the desired residue. Results obtained by pairwise information values are better than results from single amino acids. This reinforces the effect of the local environment on the accessibility of amino acid residues. The prediction accuracy of this method in a jackknife test system for two and three states is better than 70% and 60%, respectively. A comparison of the results with those reported by others involving the same data set also testifies to a better prediction accuracy in our case.

12.
FUGUE, a program for recognizing distant homologues by sequence-structure comparison (http://www-cryst.bioc.cam.ac.uk/fugue/), has three key features. (1) Improved environment-specific substitution tables. Substitutions of an amino acid in a protein structure are constrained by its local structural environment, which can be defined in terms of secondary structure, solvent accessibility, and hydrogen bonding status. The environment-specific substitution tables have been derived from structural alignments in the HOMSTRAD database (http://www-cryst.bioc.cam.ac.uk/homstrad/). (2) Automatic selection of alignment algorithm with detailed structure-dependent gap penalties. FUGUE uses the global-local algorithm to align a sequence-structure pair when they greatly differ in length and uses the global algorithm in other cases. The gap penalty at each position of the structure is determined according to its solvent accessibility, its position relative to the secondary structure elements (SSEs) and the conservation of the SSEs. (3) Combined information from both multiple sequences and multiple structures. FUGUE is designed to align multiple sequences against multiple structures to enrich the conservation/variation information. We demonstrate that the combination of these three key features implemented in FUGUE improves both homology recognition performance and alignment accuracy.

13.
The structure of human interleukin 4 (IL-4) was predicted utilizing a series of experimental and theoretical techniques. Circular dichroism (CD) spectroscopy indicated that IL-4 belonged to the all alpha-helix class of protein structures. Secondary structure prediction, site-directed mutagenesis, and CD spectroscopy suggested a predominantly alpha-helical structure, consistent with a four-helix bundle structural motif. A human/mouse IL-4 chimera was constructed to qualitatively evaluate alternative secondary structure predictions. The four predicted helices were assembled into tertiary structures using established algorithms. The mapping of three disulfide bridges in IL-4 provided additional constraints on possible tertiary structures. Using accessible surface contact area as a criterion, the most suitable structures were right-handed, all-antiparallel four-helix bundles with two overhand loop connections. Successful loop closure and incorporation of the three disulfide constraints were possible while maintaining the expected shape, solvent accessibility, and steric interactions between loops and helices. Lastly, energy minimization was used to regularize the chain.

14.
To understand how protein segments are inserted and deleted during divergent evolution, a set of pairwise alignments containing exactly one gap, and therefore arising from the first insertion-deletion (indel) event in the time separating the homologs, was examined. The alignments showed that "structure breaking" amino acids (PGDNS) were preferred within and flanking gapped regions, as were two residues with hydrophilic side-chains (QE) that frequently occur at the surface of protein folds. Conversely, hydrophobic residues (FMILYVW) occur infrequently within and flanking the gapped region. These preferences are modestly different in protein pairs separated by an episode of adaptive evolution than in pairs diverging under strong functional constraints. Surprisingly, regions near an indel have not evolved more rapidly than the sequence pair overall, showing no evidence that an indel event must be compensated by local amino acid replacement. The gap lengths are best approximated by a Zipfian distribution, with the probability of a gap of length L decreasing as L^(-1.8). These features are largely independent of the length of the gap and the extent of divergence (measured by both silent and non-silent sequence changes) separating the two proteins. Surprisingly, amino acid repeats were discovered in more than a third of the polypeptide segments in and around the gap. These correspond to repeats in the DNA sequence. This suggests that a signature of the mechanism by which indels occur in the DNA sequence remains in the encoded protein sequences. These data suggest specific tools to score gap placement in an alignment. They also suggest tools that distinguish true indels from gaps created by mistaken gene finding, including under-predicted and over-predicted introns. By providing mechanisms to identify errors, the tools will enhance the value of genome sequence databases in support of integrated paleogenomics strategies used to extract functional information in a post-genomic environment.
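The Zipfian gap-length distribution described in the abstract above can be turned into a length-dependent gap score. The following is only a minimal sketch under stated assumptions: the function names, the cutoff `max_len` used for normalization, and the choice of a negative log-probability as the penalty are all hypothetical and are not taken from the paper; only the exponent 1.8 comes from the abstract.

```python
import math

def gap_length_probs(max_len=50, exponent=1.8):
    """Normalized P(L) proportional to L**(-exponent) for L = 1..max_len.

    max_len is an assumed cutoff; the abstract does not give one.
    """
    weights = [length ** -exponent for length in range(1, max_len + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gap_penalty(length, probs):
    """Illustrative penalty: negative log-probability of a gap of this length."""
    return -math.log(probs[length - 1])

probs = gap_length_probs()
for length in (1, 2, 5, 10):
    print(length, round(gap_penalty(length, probs), 3))
```

Under this scheme the penalty grows with the logarithm of the gap length, so extending a long gap costs less than opening a new one, which is one common motivation for non-linear gap costs.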

15.
Vertebrates express two families of gap junction proteins: the well-characterized connexins and the pannexins. In contrast to connexins, pannexins do not appear to form gap junction channels but instead function as unpaired membrane channels. Pannexins have no sequence homology to connexins but are distantly related to the invertebrate gap junction proteins, innexins. Despite the sequence diversity, pannexins and connexins form channels with similar permeability properties and exhibit similar membrane topology, with two extracellular loops, four transmembrane (TM) segments, and cytoplasmic localization of amino and carboxy termini. To test whether the similarities extend to the pore structure of the channels, pannexin 1 (Panx1) was subjected to analysis with the substituted cysteine accessibility method (SCAM). The thiol reagents maleimidobutyryl-biocytin and 2-trimethylammonioethyl-methanethiosulfonate reacted with several cysteines positioned in the external portion of the first TM segment (TM1) and the first extracellular loop. These data suggest that portions of TM1 and the first extracellular loop line the outer part of the pore of Panx1 channels. In this respect, the pore structures of Panx1 and connexin channels are similar. However, although the inner part of the pore is lined by amino-terminal amino acids in connexin channels, thiol modification was detected in carboxy-terminal amino acids in Panx1 channels by SCAM analysis. Thus, it appears that the inner portion of the pores of Panx1 and connexin channels may be distinct.

16.
MOTIVATION: The solvent accessibility of amino acid residues plays an important role in tertiary structure prediction, especially in the absence of significant sequence similarity of a query protein to those with known structures. The prediction of solvent accessibility is less accurate than secondary structure prediction in spite of improvements in recent research. The k-nearest neighbor method, a simple but powerful classification algorithm, has never been applied to the prediction of solvent accessibility, although it has been used frequently for the classification of biological and medical data. RESULTS: We applied the fuzzy k-nearest neighbor method to solvent accessibility prediction, using PSI-BLAST profiles as feature vectors, and achieved high prediction accuracies. With leave-one-out cross-validation on the ASTRAL SCOP reference dataset constructed by sequence clustering, our method achieved 64.1% accuracy for a 3-state (buried/intermediate/exposed) prediction (thresholds of 9% for buried/intermediate and 36% for intermediate/exposed) and 86.7, 82.0, 79.0 and 78.5% accuracies for 2-state (buried/exposed) predictions (thresholds of 0, 5, 16 and 25% for buried/exposed, respectively). Our method also showed slightly better accuracies than other methods, by about 2-5%, on the RS126 dataset and a benchmarking dataset with 229 proteins. AVAILABILITY: Program and datasets are available at http://biocom1.ssu.ac.kr/FKNNacc/ CONTACT: jul@ssu.ac.kr.
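The 2-state and 3-state labels used in the abstract above come from thresholding a residue's relative solvent accessibility (in percent). A minimal sketch of that labeling step is shown here; the function names are hypothetical, and the boundary convention (a value exactly at a threshold counts as the less exposed class) is an assumption, since the abstract does not state it. This illustrates only how the classes are defined, not the fuzzy k-nearest neighbor predictor itself.

```python
def two_state(rsa_percent, threshold):
    """Label a residue buried/exposed for a given threshold (in percent).

    Assumed convention: values strictly above the threshold are exposed.
    """
    return "exposed" if rsa_percent > threshold else "buried"

def three_state(rsa_percent, low=9.0, high=36.0):
    """Label a residue using the 9% and 36% thresholds quoted in the abstract."""
    if rsa_percent <= low:
        return "buried"
    if rsa_percent <= high:
        return "intermediate"
    return "exposed"

# The four 2-state thresholds quoted in the abstract.
for threshold in (0, 5, 16, 25):
    print(threshold, two_state(12.0, threshold))
print(three_state(12.0))
```

Because the class balance shifts with the threshold, accuracies reported at different thresholds are not directly comparable with one another.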

17.
The neutral theory of molecular evolution predicts that variation within species is inversely related to the strength of purifying selection, but the strength of purifying selection itself must be related to physical constraints imposed by protein folding and function. In this paper, we analyzed five enzymes for which polymorphic sequence variation within Escherichia coli and/or Salmonella enterica was available, along with a protein structure. Single and multivariate logistic regression models are presented that evaluate amino acid size, physicochemical properties, solvent accessibility, and secondary structure as predictors of polymorphism. A model that contains a positive coefficient of association between polymorphism and solvent accessibility and separate intercepts for each secondary-structure element is sufficient to explain the observed variation in polymorphism between sites. The model predicts an increase in the probability of amino acid polymorphism with increasing solvent accessibility for each protein regardless of physicochemical properties, secondary-structure element, or size of the amino acid. This result, when compared with the distribution of synonymous polymorphism, which shows no association with solvent accessibility, suggests a strong decrease in purifying selection with increasing solvent accessibility.
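The form of the model described in the abstract above (one positive slope on solvent accessibility, plus a separate intercept per secondary-structure element) can be sketched as a logistic function. Everything numeric below is a hypothetical placeholder chosen for illustration, not an estimate from the paper, and the three secondary-structure labels are likewise an assumption.

```python
import math

# Hypothetical placeholder values, NOT estimates from the paper.
INTERCEPTS = {"helix": -3.0, "strand": -3.5, "coil": -2.5}
ACCESSIBILITY_COEF = 2.0  # positive sign, as the abstract reports

def polymorphism_probability(accessibility, sec_struct):
    """Logistic model with a per-class intercept and one shared slope.

    accessibility is assumed to be relative solvent accessibility in [0, 1].
    """
    logit = INTERCEPTS[sec_struct] + ACCESSIBILITY_COEF * accessibility
    return 1.0 / (1.0 + math.exp(-logit))

for acc in (0.0, 0.5, 1.0):
    print(acc, round(polymorphism_probability(acc, "helix"), 4))
```

With a positive slope, the predicted probability of polymorphism rises monotonically with accessibility within every secondary-structure class, while the intercepts only shift the curves up or down.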

18.
Thermosensation is mediated by ion channels that are highly temperature-sensitive. Several members of the family of transient receptor potential (TRP) ion channels are activated by cold or hot temperatures and have been shown to function as temperature sensors in vivo. The molecular mechanism of temperature-sensitivity of these ion channels is not understood. A number of domains or even single amino acids that regulate temperature-sensitivity have been identified in several TRP channels. However, it is unclear what precise conformational changes occur upon temperature activation. Here, we used the cysteine accessibility method to probe temperature-dependent conformations of single amino acids in TRP channels. We screened over 50 amino acids in the predicted outer pore domains of the heat-activated ion channels TRPV1 and TRPV3. In both ion channels we found residues that have temperature-dependent accessibilities to the extracellular solvent. The identified residues are located within the second predicted extracellular pore loop. These residues are identical or proximal to residues that were shown to be specifically required for temperature-activation, but not chemical activation. Our data precisely locate conformational changes upon temperature-activation within the outer pore domain. Collectively, this suggests that these specific residues and the second predicted pore loop in general are crucial for the temperature-activation mechanism of these heat-activated thermoTRPs.

19.
Structure of bovine prothrombin fragment 1 refined at 2.25 Å resolution.   (Cited by: 4; self-citations: 0; citations by others: 4)
The structure of bovine prothrombin fragment 1 has been refined at 2.25 Å resolution using high resolution measurements made with the synchrotron beam at CHESS. The synchrotron data were collected photographically by oscillation methods (R-merge = 0.08). These were combined with lower order diffractometer data for refinement purposes. The structure was refined using restrained least-squares methods with the program PROLSQ to a crystallographic R-value of 0.175. The structure includes 105 water molecules with occupancies of greater than 0.6. The first 35 residues (Ala1-Leu35) of the N-terminal gamma-carboxy glutamic acid-domain (Ala1-Cys48) of fragment 1 are disordered as are two carbohydrate chains of Mr approximately 5000; the latter two combine to render 40% of the structure disordered. The folding of the kringle of fragment 1 is related to the close intramolecular contact between the inner loop disulfide groups. Half of the conserved sequence of the kringle forms an inner core surrounding these disulfide groups. The remainder of the sequence conservation is associated with the many turns of the main chain. The Pro95 residue of the kringle has a cis conformation and Tyr74 is ordered in fragment 1, although nuclear magnetic resonance studies indicate that the comparable residue of plasminogen kringle 4 has two positions. Surface accessibility calculations indicate that none of the disulfide groups of fragment 1 is accessible to solvent.

20.
The theoretical three-dimensional structure of a novel δ-endotoxin Cry1Id (81 kDa) belonging to the Cry1I class, toxic to many lepidopteran pests, has been investigated through comparative modeling. Molecular dynamics (MD) simulations were carried out to characterize its structural and dynamical features at 10 ns in explicit solvent using GROMACS version 4.5.4. Finally, the simulated model was validated by the SAVES, WHAT IF, MetaMQAP, ProQ, ModFOLD and MolProbity servers. Despite low sequence identity with its structural homologs, Cry1Id not only resembles the previously reported Cry structures but also shares the common five conserved blocks of amino acid residues. Although domain II of Cry1Id superposes well with its closest structural homolog Cry8Ea1, variation of amino acids and length in the apical loop2 of domain II was observed. In this work, we have hypothesized that the variations in apical loop2 might be the sole factor providing variable surface accessibility to the Cry1Id protein, which could be important in receptor recognition. MD simulation showed that the proposed endotoxin retains its stable conformation in aqueous solution. The result from this study is expected to aid in the development of hybrid Cry proteins and new potent fusion proteins with novel specificities against different insect pests for improved pest management of crop plants.
