20 similar documents found (search time: 0 ms)
1.
We designed a simple position-specific hidden Markov model to predict protein structure. The framework iterates until it converges on a final target, combining fragment assembly, clustering, target selection, refinement, and consensus in a single process. Our initial implementation converges to within 6 Å of the native structure for 100% of decoys on all six standard benchmark proteins used in ROSETTA (discussed by Simons and colleagues in a recent paper), which achieved only 14%-94% on the same data. The best decoys and the final decoys to which our method converges are also of notably better quality.
2.
Prediction of transmembrane (TM) segments in the amino acid sequences of membrane proteins is a well-known and very important problem. The accuracy of its solution can still be improved for approaches that do not use a homology search in an additional data bank. Tested data are lacking in this area of research, because information on the structure of membrane proteins is scarce. In this work we created a test sample of structural alignments for membrane proteins. The TM segments of these proteins were mapped according to the aligned 3D structures resolved for these proteins. A method for predicting TM segments in an alignment was developed on the basis of the forward-backward algorithm from HMM theory. This method allows a user not only to predict TM segments, but also to create a probabilistic membrane profile, which can be employed in multiple alignment procedures that take the secondary structure of proteins into account. The method was implemented in a computer program available at http://bioinf.fbb.msu.ru/fwdbck/. It provides better results than the MEMSAT method, which is nearly the only tool that predicts TM segments in multiple alignments without a homology search.
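The forward-backward step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' program: the two-state model (0 = non-membrane, 1 = membrane), the two-letter alphabet (0 = polar, 1 = hydrophobic), and every probability below are hypothetical values chosen only to show how per-position posterior probabilities (a "membrane profile") come out of the algorithm.

```python
def posterior_state_probs(obs, start, trans, emit):
    """Scaled forward-backward: P(state at t = j | whole observation sequence)."""
    n, k = len(obs), len(start)

    # Forward pass, rescaling each row to sum to 1 to avoid underflow.
    fwd, scale = [], []
    for t, o in enumerate(obs):
        if t == 0:
            row = [start[j] * emit[j][o] for j in range(k)]
        else:
            row = [sum(fwd[t - 1][i] * trans[i][j] for i in range(k)) * emit[j][o]
                   for j in range(k)]
        s = sum(row)
        scale.append(s)
        fwd.append([v / s for v in row])

    # Backward pass, reusing the forward scaling factors.
    bwd = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        o = obs[t + 1]
        bwd[t] = [sum(trans[i][j] * emit[j][o] * bwd[t + 1][j] for j in range(k))
                  / scale[t + 1] for i in range(k)]

    # Combine and normalise per position.
    post = []
    for t in range(n):
        row = [fwd[t][j] * bwd[t][j] for j in range(k)]
        s = sum(row)
        post.append([v / s for v in row])
    return post


# Hypothetical toy parameters (assumptions, not taken from the paper).
START = [0.8, 0.2]                  # P(first state)
TRANS = [[0.9, 0.1], [0.1, 0.9]]    # P(next state | current state)
EMIT = [[0.7, 0.3], [0.2, 0.8]]     # P(symbol | state)
OBS = [0, 0, 1, 1, 1, 1, 0, 0]      # polar, polar, hydrophobic x4, polar, polar

profile = [row[1] for row in posterior_state_probs(OBS, START, TRANS, EMIT)]
```

Here `profile[t]` is the posterior probability that position `t` lies in a membrane segment given the whole sequence, which is the kind of quantity a probabilistic membrane profile would store per alignment column.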
3.
Qiwen Dong (qwdong@insun.hit.edu.cn) Xiaolong Wang Lei Lin Yi Guan 《Science in China Series C (English Edition)》2005,48(4):394-405
A novel method for predicting the secondary structures of proteins from amino acid sequence is presented. Protein secondary structure seqlets, which are analogous to words in natural language, are extracted. These seqlets capture the relationship between amino acid sequence and protein secondary structure, and together form the protein secondary structure dictionary. More specifically, the dictionary is organism-specific. Protein secondary structure prediction is formulated as an integrated word segmentation and part-of-speech tagging problem. A word lattice is used to represent the results of the word segmentation, and a maximum entropy model is used to calculate the probability that a seqlet is tagged as a certain secondary structure type. The method is Markovian in the seqlets, permitting efficient exact calculation of the posterior probability distribution over all possible word segmentations and their tags by the Viterbi algorithm. The optimal segmentations and their tags are computed as the results of protein secondary structure prediction. The method is applied to predict the secondary structures of proteins from four organisms and is compared with the PHD method. The results show that this method outperforms PHD by about 3.9% in Q3 accuracy and 4.6% in SOV accuracy. Combining it with locally similar protein sequences obtained by BLAST gives better predictions. The method is also tested on the 50 CASP5 target proteins, with a Q3 accuracy of 78.9% and an SOV accuracy of 77.1%. A web server for protein secondary structure prediction has been constructed and is available at http://www.insun.hit.edu.cn:81/demos/biology/index.html.
4.
5.
《Journal of molecular biology》2022,434(5):167407
Intrinsically disordered proteins (IDPs) are an important class of proteins which lack tertiary structure elements. Their dynamic properties can depend on reversible post-translational modifications and the complex cellular milieu, which provides a crowded environment. Both influence the thermodynamic stability and folding of globular proteins as well as the conformational plasticity of IDPs. Here we investigate the intrinsically disordered C-terminal region (amino acids 613–694) of human Grb2-associated binding protein 1 (Gab1), which binds to the disease-relevant Src homology region 2 (SH2) domain-containing protein tyrosine phosphatase SHP2 (PTPN11). This binding is mediated by phosphorylation at Tyr 627 and Tyr 659 in Gab1. We characterize induced structure in Gab1(613–694) and binding to SHP2 by NMR, CD and ITC under non-crowding and crowding conditions, employing chemical and biological crowding agents, and compare the results for the non-phosphorylated and tyrosine-phosphorylated C-terminal Gab1 fragment. Our results show that under crowding conditions pre-structured motifs are formed in two distinct regions of Gab1, whereas phosphorylation has no impact on the dynamics and IDP character. These structured regions are identical to the regions that bind SHP2. Therefore, biological crowders could induce some SHP2 binding capacity. Our results indicate that high concentrations of macromolecules stabilize the preformed or excited binding state in the C-terminal Gab1 region and foster binding to the SH2 tandem motif of SHP2, even in the absence of tyrosine phosphorylation.
6.
C. M. Stultz J. V. White T. F. Smith 《Protein science : a publication of the Protein Society》1993,2(3):305-314
A new method has been developed to compute the probability that each amino acid in a protein sequence is in a particular secondary structural element. Each of these probabilities is computed using the entire sequence and a set of predefined structural class models. This set of structural classes is patterned after Jane Richardson's taxonomy for the domains of globular proteins. For each structural class considered, a mathematical model is constructed to represent constraints on the pattern of secondary structural elements characteristic of that class. These are stochastic models having discrete state spaces (referred to as hidden Markov models by researchers in signal processing and automatic speech recognition). Each model is a mathematical generator of amino acid sequences; the sequence under consideration is modeled as having been generated by one model in the set of candidates. The probability that each model generated the given sequence is computed using a filtering algorithm. The protein is then classified as belonging to the structural class having the most probable model. The secondary structure of the sequence is then analyzed using a "smoothing" algorithm that is optimal for that structural class model. For each residue position in the sequence, the smoother computes the probability that the residue is contained within each of the defined secondary structural elements of the model. This method has two important advantages: (1) the probability of each residue being in each of the modeled secondary structural elements is computed using the totality of the amino acid sequence, and (2) these probabilities are consistent with prior knowledge of realizable domain folds as encoded in each model. As an example of the method's utility, we present its application to flavodoxin, a prototypical alpha/beta protein having a central beta-sheet, and to thioredoxin, which belongs to a similar structural class but shares no significant sequence similarity.
7.
GAO Guanghua DAI Jixun DING Ming Goran Hellekant WANG Jinfeng WANG Dacheng 《Science China: Life Sciences (English Edition)》1999,42(4):409-419
Brazzein is a sweet-tasting protein isolated from the fruit of the West African plant Pentadiplandra brazzeana Baillon. It is the smallest and the most water-soluble sweet protein discovered so far and is highly thermostable. A proton NMR study of brazzein at 600 MHz (pH 3.5, 300 K) is presented. Complete sequence-specific assignments of the individual backbone and side-chain proton resonances were achieved using through-bond and through-space connectivities obtained from standard two-dimensional NMR techniques. The secondary structure of brazzein contains one alpha-helix (residues 21-29), one short 3(10)-helix (residues 14-17), two strands of antiparallel beta-sheet (residues 34-39, 44-50) and probably a third strand (residues 5-7) near the N-terminus. A comparative analysis found that brazzein shares a so-called 'cysteine-stabilized alpha-beta' (CSalphabeta) motif with scorpion neurotoxins, insect defensins and plant gamma-thionins. The significance of this multi-function motif, the possible active sites and the structural basis of thermostability are discussed.
8.
Brazzein is a sweet-tasting protein isolated from the fruit of the West African plant Pentadiplandra brazzeana Baillon. It is the smallest and the most water-soluble sweet protein discovered so far and is highly thermostable. A proton NMR study of brazzein at 600 MHz (pH 3.5, 300 K) is presented. Complete sequence-specific assignments of the individual backbone and side-chain proton resonances were achieved using through-bond and through-space connectivities obtained from standard two-dimensional NMR techniques. The secondary structure of brazzein contains one α-helix (residues 21-29), one short 3(10)-helix (residues 14-17), two strands of antiparallel β-sheet (residues 34-39, 44-50) and probably a third strand (residues 5-7) near the N-terminus. A comparative analysis found that brazzein shares a so-called 'cysteine-stabilized alpha-beta' (CSαβ) motif with scorpion neurotoxins, insect defensins and plant γ-thionins. The significance of this multi-function motif, the possible active sites and…
9.
C. Sun A. Holmgren J. H. Bushweller 《Protein science : a publication of the Protein Society》1997,6(2):383-390
Human glutaredoxin is a member of the glutaredoxin family, which is characterized by a glutathione binding site and a redox-active dithiol/disulfide in the active site. Unlike Escherichia coli glutaredoxin-1, this protein has additional cysteine residues that have been suggested to play a regulatory role in its activity. Human glutaredoxin (106 amino acid residues, M(r) = 12,000) has been purified from a pET expression vector with both uniform 15N labeling and 13C/15N double labeling. A combination of three-dimensional 15N-edited TOCSY, 15N-edited NOESY, HNCA, HN(CO)CA, and gradient sensitivity-enhanced HNCACB and HNCO spectra was used to obtain sequential assignments for residues 2-106 of the protein. The gradient-enhanced version of the HCCH-TOCSY pulse sequence and HCCH-COSY were used to obtain side chain 1H and 13C assignments. The secondary structural elements in the reduced protein were identified based on NOE information, amide proton exchange data, and chemical shift index data. Human glutaredoxin contains five helices extending approximately from residues 4-10, 24-36, 53-64, 83-92, and 94-104. The secondary structure also shows four beta-strands comprising residues 15-19, 43-48, 71-75, and 78-80, which form a beta-sheet almost identical to that found in E. coli glutaredoxin-1. Complete 1H, 13C, and 15N assignments and the secondary structure of fully reduced human glutaredoxin are presented. A comparison to the structures of other glutaredoxins is presented and differences in the secondary structure elements are discussed.
10.
M. T. Reymond G. Merutka H. J. Dyson P. E. Wright 《Protein science : a publication of the Protein Society》1997,6(3):706-716
Myoglobin has been studied extensively as a paradigm for protein folding. As part of an ongoing study of potential folding initiation sites in myoglobin, we have synthesized a series of peptides covering the entire sequence of sperm whale myoglobin. We report here on the conformational preferences of a series of peptides that cover the region from the A helix to the FG turn. Structural propensities were determined using circular dichroism and nuclear magnetic resonance spectroscopy in aqueous solution, trifluoroethanol, and methanol. Peptides corresponding to helical regions in the native protein, namely the B, C, D, and E helices, populate the alpha region of (phi, psi) space in water solution but show no measurable helix formation except in the presence of trifluoroethanol. The F-helix sequence has a much lower propensity to populate helical conformations even in TFE. Despite several attempts, we were not successful in synthesizing a peptide corresponding to the A-helix region that was soluble in water. A peptide termed the AB domain was constructed spanning the A- and B-helix sequences. The AB domain is not soluble in water, but shows extensive helix formation throughout the peptide when dissolved in methanol, with a break in the helix at a site close to the A-B helix junction in the intact folded myoglobin protein. With the exception of one local preference for a turn conformation stabilized by hydrophobic interactions, the peptides corresponding to turns in the folded protein do not measurably populate beta-turn conformations in water, and the addition of trifluoroethanol does not enhance the formation of either helical or turn structure. In contrast to the series of peptides described here, earlier studies of peptides from the GH region of myoglobin show a marked tendency to populate helical structures (H), nascent helical structures (G), or turn conformations (GH peptide) in water solution. This region, together with the A-helix and part of the B-helix, has been shown to participate in an early folding intermediate. The complete analysis of conformational properties of isolated myoglobin peptides supports the hypothesis that spontaneous secondary structure formation in local regions of the polypeptide may play an important role in the initiation of protein folding.
11.
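Isolation, purification, and identification of two new natural triterpenoid constituents from the fruiting bodies of Ganoderma lucidum Cited by: 1 (self-citations: 0, citations by others: 1)
Triterpenoids were isolated and purified from the fruiting bodies of Ganoderma lucidum by silica gel and MCI column chromatography. Two natural products new to the genus Ganoderma were obtained from the chloroform extract of the fruiting bodies. Their structures were determined by modern NMR techniques as methyl 7β-hydroxy-3,11,15,23-tetraoxo-5α-lanost-8-en-26-oate (methyl ganoderate D) (Ⅰ) and methyl 12β-acetoxy-3,7,11,15-tetraoxo-5α-lanost-8-en-24-oate (methyl lucidenate D) (Ⅱ).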
12.
A systematic study of helix-helix packing in a comprehensive database of protein structures revealed that the side chains inside helix-helix interfaces on average are shorter than those in the noninterface parts of the helices. The study follows our earlier study of this effect in transmembrane helices. The results obtained on the entire database of protein structures are consistent with those obtained on the transmembrane helices. The difference in the length of interface and noninterface side chains is small but statistically significant. It indicates that helices, if viewed along their main axis, statistically are not circular, but have a flattened interface. This effect brings the helices closer to each other and creates a tighter structural packing. The results provide an interesting insight into the aspects of protein structure and folding.
13.
PsiCSI is a highly accurate and automated method of assigning secondary structure from NMR data, which is a useful intermediate step in the determination of tertiary structures. The method combines information from chemical shifts and protein sequence using three layers of neural networks. Training and testing were performed on a suite of 92 proteins (9437 residues) with known secondary and tertiary structure. Using a stringent cross-validation procedure in which the target and homologous proteins were removed from the databases used for training the neural networks, an average 89% Q3 accuracy (per residue) was observed. This is an increase of 6.2% and 5.5% (representing 36% and 33% fewer errors) over methods that use chemical shifts (CSI) or sequence information (Psipred) alone. In addition, PsiCSI improves upon the translation of chemical shift information to secondary structure (Q3 = 87.4%) and is able to use sequence information as an effective substitute for sparse NMR data (Q3 = 86.9% without 13C shifts and Q3 = 86.8% with only Hα shifts available). Finally, errors made by PsiCSI almost exclusively involve the interchange of helix or strand with coil and not helix with strand (<2.5 occurrences per 10000 residues). The automation, increased accuracy, absence of gross errors, and robustness with regard to sparse data make PsiCSI ideal for high-throughput applications, and should improve the effectiveness of hybrid NMR/de novo structure determination methods. A Web server is available for users to submit data and have the assignment returned.
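The accuracy figures quoted in the abstract above can be checked with a short worked calculation. The sketch below assumes that Q3 is the percentage of residues assigned to the correct one of three states (helix, strand, coil) and that the reported increases of 6.2% and 5.5% are percentage points; the function names and the example strings are illustrative only.

```python
def q3(predicted, reference):
    """Per-residue three-state accuracy in percent (H, E, C labels)."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have equal length")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * matches / len(reference)


def relative_error_reduction(new_acc, old_acc):
    """Fraction of the old method's errors that the new method removes.

    Both accuracies are percentages; error is taken as 100 - accuracy.
    """
    old_err = 100.0 - old_acc
    new_err = 100.0 - new_acc
    return (old_err - new_err) / old_err


# 89% Q3 versus baselines that are 6.2 and 5.5 percentage points lower
# (the percentage-point reading is an assumption, see the lead-in).
vs_csi = relative_error_reduction(89.0, 89.0 - 6.2)       # about 0.36
vs_psipred = relative_error_reduction(89.0, 89.0 - 5.5)   # about 0.33
```

Under that reading the baselines sit at 82.8% and 83.5%, so the error rate falls from 17.2% or 16.5% to 11%, which reproduces the "36% and 33% fewer errors" stated in the abstract.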
14.
The public archives containing protein information in the form of NMR chemical shift data at the BioMagResBank (BMRB) and of 3D structure coordinates at the Protein Data Bank are continuously expanding. The quality of the data contained in these archives, however, varies. The main issue for chemical shift values is that they are determined relative to a reference frequency. When this reference frequency is set incorrectly, all related chemical shift values are systematically offset. Such wrongly referenced chemical shift values, as well as other problems such as chemical shift values that are assigned to the wrong atom, are not easily distinguished from correct values and effectively reduce the usefulness of the archive. We describe a new method to correct and validate protein chemical shift values in relation to their 3D structure coordinates. This method classifies atoms using two parameters: the per‐atom solvent accessible surface area (as calculated from the coordinates) and the secondary structure of the parent amino acid. Through the use of Gaussian statistics based on a large database of 3220 BMRB entries, we obtain per‐entry chemical shift corrections as well as Z scores for the individual chemical shift values. In addition, information on the error of the correction value itself is available, and the method can retain only dependable correction values. We provide an online resource with chemical shift, atom exposure, and secondary structure information for all relevant BMRB entries ( http://www.ebi.ac.uk/pdbe/nmr/vasco ) and hope this data will aid the development of new chemical shift‐based methods in NMR. Proteins 2010. © 2010 Wiley‐Liss, Inc.
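The idea of a per-entry referencing correction followed by per-value Z scores can be sketched as follows. The abstract does not give the estimator, so this is only one simple possibility under stated assumptions: each atom class has a known Gaussian mean and standard deviation, the entry-wide offset is taken as the plain average deviation from those means, and all numbers below are made up for illustration.

```python
from statistics import mean


def reference_offset(observed, class_means):
    """Estimate a systematic referencing offset for one entry.

    Assumption: the offset is the average difference between each observed
    shift and the expected mean of that atom's class.
    """
    return mean(o - m for o, m in zip(observed, class_means))


def z_scores(observed, class_means, class_sds, offset=0.0):
    """Z score of each shift after subtracting the entry-wide offset."""
    return [(o - offset - m) / s
            for o, m, s in zip(observed, class_means, class_sds)]


# Hypothetical class statistics (ppm) and an entry mis-referenced by +2 ppm.
CLASS_MEANS = [58.0, 56.5, 62.1]
CLASS_SDS = [1.5, 1.4, 2.0]
OBSERVED = [m + 2.0 for m in CLASS_MEANS]

offset = reference_offset(OBSERVED, CLASS_MEANS)
corrected = z_scores(OBSERVED, CLASS_MEANS, CLASS_SDS, offset)
uncorrected = z_scores(OBSERVED, CLASS_MEANS, CLASS_SDS)
```

In this toy case every uncorrected Z score is positive, which is the signature of a referencing error rather than of individual misassignments; after the offset is removed the Z scores fall back to approximately zero.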
15.
DONG Qiwen WANG Xiaolong LIN Lei & GUAN Yi (School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China) 《Science China: Life Sciences (English Edition)》2005,48(4):394-405
1 Introduction. The prediction of protein structure and function from amino acid sequences is one of the most important problems in molecular biology. This problem is becoming more pressing as the number of known protein sequences grows as a result of genome and other sequencing projects, and the protein sequence-structure gap is widening rapidly[1]. Therefore, computational tools to predict protein structures are needed to narrow the widening gap. Although the prediction of three dim…
16.
Different programs and methods were employed to superimpose protein structures, using members of four very different protein families as test subjects, and the results of these efforts were compared. Algorithms based on human identification of key amino acid residues on which to base the superpositions were nearly always more successful than programs that used automated techniques to identify key residues. Among those programs automatically identifying key residues, MASS could not superimpose all members of some families, but was very efficient with other families. MODELLER, MultiProt, and STAMP had varying levels of success. A genetic algorithm program written for this project did not improve superpositions when results from neighbor-joining and pseudostar algorithms were used as its starting cases, but it always improved superpositions obtained by MODELLER and STAMP. A program entitled PyMSS is presented that includes three superposition algorithms featuring human interaction.
17.
A combined transmembrane topology and signal peptide prediction method Cited by: 31 (self-citations: 0, citations by others: 31)
An inherent problem in transmembrane protein topology prediction and signal peptide prediction is the high similarity between the hydrophobic regions of a transmembrane helix and that of a signal peptide, leading to cross-reaction between the two types of predictions. To improve predictions further, it is therefore important to make a predictor that aims to discriminate between the two classes. In addition, topology information can be gained when successfully predicting a signal peptide leading a transmembrane protein since it dictates that the N terminus of the mature protein must be on the non-cytoplasmic side of the membrane. Here, we present Phobius, a combined transmembrane protein topology and signal peptide predictor. The predictor is based on a hidden Markov model (HMM) that models the different sequence regions of a signal peptide and the different regions of a transmembrane protein in a series of interconnected states. Training was done on a newly assembled and curated dataset. Compared to TMHMM and SignalP, errors coming from cross-prediction between transmembrane segments and signal peptides were reduced substantially by Phobius. False classifications of signal peptides were reduced from 26.1% to 3.9% and false classifications of transmembrane helices were reduced from 19.0% to 7.7%. Phobius was applied to the proteomes of Homo sapiens and Escherichia coli. Here we also noted a drastic reduction of false classifications compared to TMHMM/SignalP, suggesting that Phobius is well suited for whole-genome annotation of signal peptides and transmembrane regions. The method is available at as well as at
18.
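Application of neural networks to protein secondary structure prediction Cited by: 3 (self-citations: 0, citations by others: 3)
This review introduces the significance of research on protein secondary structure prediction, discusses design issues for neural networks used in this task, reviews in some detail the main progress made in recent years in applying neural network methods to protein secondary structure prediction, and considers the outlook for protein structure prediction.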
19.
Recent progress in structure determination techniques has led to a significant growth in the number of known membrane protein structures, and the first structural genomics projects focusing on membrane proteins have been initiated, warranting an investigation of appropriate bioinformatics strategies for optimal structural target selection for these molecules. What determines a membrane protein fold? How many membrane structures need to be solved to provide sufficient structural coverage of the membrane protein sequence space? We present the CAMPS database (Computational Analysis of the Membrane Protein Space) containing almost 45,000 proteins with three or more predicted transmembrane helices (TMH) from 120 bacterial species. This large set of membrane proteins was subjected to single‐linkage clustering using only sequence alignments covering at least 40% of the TMH present in a given family. This process yielded 266 sequence clusters with at least 15 members, roughly corresponding to membrane structural folds, sufficiently structurally homogeneous in terms of the variation of TMH number between individual sequences. These clusters were further subdivided into functionally homogeneous subclusters according to the COG (Clusters of Orthologous Groups) system as well as more stringently defined families sharing at least 30% identity. The CAMPS sequence clusters are thus designed to reflect three main levels of interest for structural genomics: fold, function, and modeling distance. We present a library of Hidden Markov Models (HMM) derived from sequence alignments of TMH at these three levels of sequence similarity. Given that 24 out of 266 clusters corresponding to membrane folds already have associated known structures, we estimate that 242 additional new structures, one for each remaining cluster, would provide structural coverage at the fold level of roughly 70% of prokaryotic membrane proteins belonging to the currently most populated families. Proteins 2006. © 2006 Wiley‐Liss, Inc.
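Single-linkage clustering, as named in the abstract above, merges two groups as soon as any one member of each is linked. A minimal sketch with a union-find structure is shown below. It assumes the pairwise links have already been decided (for example, by an alignment passing a coverage threshold); the protein identifiers and pairs are hypothetical, and this is not the CAMPS pipeline itself.

```python
def single_linkage_clusters(items, linked_pairs):
    """Group items so that any chain of pairwise links ends up in one cluster."""
    parent = {x: x for x in items}

    def find(x):
        # Walk to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in linked_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for x in items:
        clusters.setdefault(find(x), []).append(x)
    # Largest clusters first.
    return sorted(clusters.values(), key=len, reverse=True)


# Hypothetical proteins and links that passed some alignment criterion.
PROTEINS = ["p1", "p2", "p3", "p4", "p5", "p6"]
LINKS = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]

clusters = single_linkage_clusters(PROTEINS, LINKS)

# The abstract's target estimate: clusters without a known structure.
remaining_targets = 266 - 24
```

Note that `p1` and `p3` land in the same cluster although they are not directly linked, which is the defining behavior of single linkage. The last line restates the abstract's arithmetic: 266 clusters minus 24 with known structures leaves 242 targets.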
20.
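This review describes the three-dimensional structures of apo-CaM, Ca2+-CaM, and complexes of CaM with its target peptides and antagonists. Calmodulin (CaM), a multifunctional cellular Ca2+ receptor, plays an important role in cellular signal transduction. In recent years its three-dimensional structure has become much better understood, allowing a clearer picture of the Ca2+ activation of CaM and of the mechanism by which CaM acts on its target enzymes.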