Similar articles
1.
Prediction of the disulfide-bonding state of cysteine in proteins   Total citations: 5 (self-citations: 0; citations by others: 5)
The bonding states of cysteine play important functional and structural roles in proteins. In particular, disulfide bond formation is one of the most important factors influencing the three-dimensional fold of proteins. Proteins of known structure were used to teach computer-simulated neural networks rules for predicting the disulfide-bonding state of a cysteine given only its flanking amino acid sequence. Resulting networks make accurate predictions on sequences different from those used in training, suggesting that local sequence greatly influences cysteines in disulfide bond formation. The average prediction rate after seven independent network experiments is 81.4% for disulfide-bonded and 80.0% for non-disulfide-bonded scenarios. Predictive accuracy is related to the strength of network output activities. Network weights reveal interesting position-dependent amino acid preferences and provide a physical basis for understanding the correlation between the flanking sequence and a cysteine's disulfide-bonding state. Network predictions may be used to increase or decrease the stability of existing disulfide bonds or to aid the search for potential sites to introduce new disulfide bonds.
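
The abstract above describes predicting a cysteine's bonding state from its flanking sequence alone. The following is a minimal sketch of how such an input could be prepared for a neural network. It assumes a window of five residues on each side and a one-hot encoding with one extra slot for chain ends or non-standard residues; the abstract states neither the window size nor the encoding, and the names `encode_flanks` and `cysteine_windows` are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
SLOTS = len(AMINO_ACIDS) + 1          # assumed extra slot: chain end / unknown


def encode_flanks(seq, pos, flank=5):
    """One-hot encode the residues flanking seq[pos], skipping pos itself.

    flank=5 is an assumption made for illustration only.
    """
    vec = []
    for i in range(pos - flank, pos + flank + 1):
        if i == pos:
            continue  # the central cysteine carries no information
        slot = [0] * SLOTS
        if 0 <= i < len(seq) and seq[i] in AMINO_ACIDS:
            slot[AMINO_ACIDS.index(seq[i])] = 1
        else:
            slot[-1] = 1  # position beyond the chain end or non-standard residue
        vec.extend(slot)
    return vec


def cysteine_windows(seq, flank=5):
    """Map the index of every cysteine in seq to its encoded flanking window."""
    return {i: encode_flanks(seq, i, flank)
            for i, residue in enumerate(seq) if residue == "C"}
```

Each cysteine then yields a fixed-length vector of 2 x flank x 21 values, which a classifier of the reader's choice could take as input.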

2.
The task of predicting the cysteine-bonding state in proteins starting from the residue chain is addressed by implementing a new hybrid system that combines a neural network and a hidden Markov model (hidden neural network). Training is performed using 4136 cysteine-containing segments extracted from 969 nonhomologous proteins of well-resolved three-dimensional structure. Under 20-fold cross-validation, prediction accuracy reaches 88% and 84% when measured on a per-cysteine and per-protein basis, respectively. These results outperform previously described methods for the same task.

3.
4.
Kaur H  Raghava GP 《FEBS letters》2004,564(1-2):47-57
In this study, an attempt has been made to develop a neural network-based method for predicting segments in proteins containing aromatic-backbone NH (Ar-NH) interactions using multiple sequence alignment. We have analyzed 3121 segments seven residues long containing Ar-NH interactions, extracted from 2298 non-redundant protein structures where no two proteins have more than 25% sequence identity. Two consecutive feed-forward neural networks with a single hidden layer have been trained with standard back-propagation as learning algorithm. The performance of the method improves from 0.12 to 0.15 in terms of Matthews correlation coefficient (MCC) value when evolutionary information (multiple alignment obtained from PSI-BLAST) is used as input instead of a single sequence. The performance of the method further improves from MCC 0.15 to 0.20 when secondary structure information predicted by PSIPRED is incorporated in the prediction. The final network yields an overall prediction accuracy of 70.1% and an MCC of 0.20 when tested by five-fold cross-validation. Overall the performance is 15.2% higher than the random prediction. The method consists of two neural networks: (i) a sequence-to-structure network which predicts the aromatic residues involved in Ar-NH interaction from multiple alignment of protein sequences and (ii) a structure-to-structure network where the input consists of the output obtained from the first network and predicted secondary structure. Further, the actual position of the donor residue within the 'potential' predicted fragment has been predicted using a separate sequence-to-structure neural network. Based on the present study, a server Ar_NHPred has been developed which predicts Ar-NH interaction in a given amino acid sequence. The web server Ar_NHPred is available at and (mirror site).
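
Several abstracts in this list report both accuracy and the Matthews correlation coefficient (MCC). As a reading aid, here is a small sketch of the standard definitions computed from confusion-matrix counts. The function names are illustrative and the counts in the usage note are invented; none of this comes from the cited papers.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention for the
    case where one class is never predicted or never present.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

With made-up counts, a predictor that always answers "negative" on a set of 10 positives and 90 negatives gives `accuracy(0, 90, 0, 10)` equal to 0.9 but `matthews_corrcoef(0, 90, 0, 10)` equal to 0.0, which is why a high accuracy can sit alongside a low MCC on imbalanced data.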

5.
In the eukaryotic cell, the formation of disulfide bonds generally takes place inside the endoplasmic reticulum, which provides a unique folding environment. The DisulfideDB database gathers information about this biological process with structural, evolutionary and neighborhood information on cysteines in proteins. Mining this information with an association rule discovery program makes it possible to extract some strong rules for the prediction of the disulfide-bonding state of cysteines.

6.

Background

Annotations that describe the function of sequences are enormously important to researchers during laboratory investigations and when making computational inferences. However, there has been little investigation into the data quality of sequence function annotations. Here we have developed a new method of estimating the error rate of curated sequence annotations, and applied this to the Gene Ontology (GO) sequence database (GOSeqLite). This method involved artificially adding errors to sequence annotations at known rates, and used regression to model the impact on the precision of annotations based on BLAST matched sequences.

Results

We estimated the error rate of curated GO sequence annotations in the GOSeqLite database (March 2006) at between 28% and 30%. Annotations made without use of sequence similarity based methods (non-ISS) had an estimated error rate of between 13% and 18%. Annotations made with the use of sequence similarity methodology (ISS) had an estimated error rate of 49%.

Conclusion

While the overall error rate is reasonably low, it would be prudent to treat all ISS annotations with caution. Electronic annotators that use ISS annotations as the basis of predictions are likely to have higher false prediction rates, and for this reason designers of these systems should consider avoiding ISS annotations where possible. Electronic annotators that use ISS annotations to make predictions should be viewed sceptically. We recommend that curators thoroughly review ISS annotations before accepting them as valid. Overall, users of curated sequence annotations from the GO database should feel assured that they are using a comparatively high quality source of information.

7.
The attainment of a complete map‐based sequence for rice (Oryza sativa) is clearly a major milestone for the research community. Identifying the localization of encoded proteins is the key to understanding their functional characteristics and facilitating their purification. Our proposed method, RSLpred, is an effort in this direction for genome‐scale subcellular prediction of encoded rice proteins. First, the support vector machine (SVM)‐based modules have been developed using traditional amino acid‐, dipeptide‐ (i+1) and four parts‐amino acid composition and achieved an overall accuracy of 81.43, 80.88 and 81.10%, respectively. Secondly, a similarity search‐based module has been developed using position‐specific iterated‐basic local alignment search tool and achieved 68.35% accuracy. Another module developed using evolutionary information of a protein sequence extracted from position‐specific scoring matrix achieved an accuracy of 87.10%. In this study, a large number of modules have been developed using various encoding schemes like higher‐order dipeptide composition, N‐ and C‐terminal, split amino acid composition and the hybrid information. In order to benchmark RSLpred, it was tested on an independent set of rice proteins where it outperformed widely used prediction methods such as TargetP, Wolf‐PSORT, PA‐SUB, Plant‐Ploc and ESLpred. To assist the plant research community, an online web tool ‘RSLpred’ has been developed for subcellular prediction of query rice proteins, which is freely accessible at http://www.imtech.res.in/raghava/rslpred.

8.
Sethi D  Garg A  Raghava GP 《Amino acids》2008,35(3):599-605
The association of structurally disordered proteins with a number of diseases has engendered enormous interest and therefore demands a prediction method that would facilitate their expeditious study at molecular level. The present study describes the development of a computational method for predicting disordered proteins using sequence and profile compositions as input features for the training of SVM models. First, we developed the amino acid and dipeptide composition-based SVM modules, which yielded sensitivities of 75.6 and 73.2% along with Matthews correlation coefficient (MCC) values of 0.75 and 0.60, respectively. In addition, the use of predicted secondary structure content (coil, sheet and helices) in the form of composition values attained a sensitivity of 76.8% and MCC value of 0.77. Finally, the training of SVM models using evolutionary information hidden in the multiple sequence alignment profile improved the prediction performance by achieving a sensitivity value of 78% and MCC of 0.78. Furthermore, when evaluated on an independent dataset of partially disordered proteins, the same SVM module provided a correct prediction rate of 86.6%. Based on the above study, a web server (“DPROT”) was developed for the prediction of disordered proteins, which is available at .
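
Amino acid and dipeptide compositions recur in this list as SVM input features. The sketch below shows one common way to compute them: 20 and 400 values, each a fraction of the counted residues or adjacent pairs. It is an assumption about the general technique, not the authors' code; the papers may scale values differently (for example as percentages), and the function names are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return 20 fractions, one per standard residue, in AMINO_ACIDS order."""
    residues = [c for c in seq.upper() if c in AMINO_ACIDS]
    n = len(residues)
    return [residues.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 fractions, one per ordered pair of adjacent residues.

    Pairs containing a non-standard character are skipped (an assumption).
    """
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [c / total if total else 0.0 for c in counts.values()]
```

The resulting fixed-length vectors do not depend on sequence length, which is what lets proteins of different sizes share one classifier input format.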

9.
We present a model of amino acid sequence evolution based on a hidden Markov model that extends to transmembrane proteins previous methods that incorporate protein structural information into phylogenetics. Our model aims to give a better understanding of processes of molecular evolution and to extract structural information from multiple alignments of transmembrane sequences and use such information to improve phylogenetic analyses. This should be of value in phylogenetic studies of transmembrane proteins: for example, mitochondrial proteins have acquired a special importance in phylogenetics and are mostly transmembrane proteins. The improvement in fit to example data sets of our new model relative to less complex models of amino acid sequence evolution is statistically tested. To further illustrate the potential utility of our method, phylogeny estimation is performed on primate CCR5 receptor sequences, sequences of L and M subunits of the light reaction center in purple bacteria, guinea pig sequences with respect to lagomorph and rodent sequences of calcitonin receptor and K-substance receptor, and cetacean sequences of cytochrome b.

10.
This study presents an allergenic protein prediction system that appears to be capable of producing high sensitivity and specificity. The proposed system is based on support vector machine (SVM) using evolutionary information in the form of an amino acid position specific scoring matrix (PSSM). The performance of this system is assessed by a 10-fold cross-validation experiment using a dataset consisting of 693 allergens and 1041 non-allergens obtained from Swiss-Prot and Structural Database of Allergenic Proteins (SDAP). The PSSM method produced an accuracy of 90.1% in comparison to the methods based on SVM using amino acid, dipeptide composition, pseudo (5-tier) amino acid composition that achieved an accuracy of 86.3, 86.5 and 82.1% respectively. The results show that evolutionary information can be useful to build more effective and efficient allergen prediction systems.

11.
Wang CC  Chen JH  Yin SH  Chuang WJ 《Proteins》2006,64(1):219-226
Different programs and methods were employed to superimpose protein structures, using members of four very different protein families as test subjects, and the results of these efforts were compared. Algorithms based on human identification of key amino acid residues on which to base the superpositions were nearly always more successful than programs that used automated techniques to identify key residues. Among those programs automatically identifying key residues, MASS could not superimpose all members of some families, but was very efficient with other families. MODELLER, MultiProt, and STAMP had varying levels of success. A genetic algorithm program written for this project did not improve superpositions when results from neighbor-joining and pseudostar algorithms were used as its starting cases, but it always improved superpositions obtained by MODELLER and STAMP. A program entitled PyMSS is presented that includes three superposition algorithms featuring human interaction.

12.
N Nagano  M Ota  K Nishikawa 《FEBS letters》1999,458(1):69-71
The differences between disulfide-bonding cystine (Cys_SS) and free cysteine (Cys_SH) residues were examined by analyzing the statistical distribution of both types of residue in proteins of known structure. Surprisingly, Cys_SH residues display stronger hydrophobicity than Cys_SS residues. A detailed survey of atoms which come into contact with the sulfhydryl group (sulfur atom) of Cys_SH revealed those atoms are essentially the same in number and variety as those of the methyl group of isoleucine, but are quite different to those of the hydroxyl group of serine. Moreover, the relationships among amino acids were also determined using the 3D-profile table of known protein structures. Cys_SH was located in the hydrophobic cluster, along with residues such as Met, Trp and Tyr, and was clearly separated from Ser and Thr in the polar cluster. These results imply that free cysteines behave as strongly hydrophobic, and not hydrophilic, residues in proteins.

13.
The affinity-labeling technique is an extremely important method in receptor biochemistry. The 3-nitro-2-pyridinesulfenyl (Npys) group, attached to a mercapto group, can react only with a free thiol group (the beta-mercapto group of a cysteine residue) of the target receptor molecules, forming a disulfide bond. This disulfide bonding is mediated through the thiol-disulfide exchange reaction. Unlike other labeling methods, the approach utilizing such chemically activated thiol-containing ligands is able to reproduce an unlabeled protein by treatment with dithiothreitol, a reducing reagent. This provides several unique aspects for studies elucidating the structure-function relationships between the peptide and the receptor. Based on the SNpys affinity technique, we have achieved the discriminative disulfide-bonding affinity labeling of the three different subtypes of opioid receptors: mu, delta and kappa. This article reviews our novel affinity techniques in in vitro receptor biochemistry.

14.
MOTIVATION: The prediction of beta-turns is an important element of protein secondary structure prediction. Recently, a highly accurate neural network based method Betatpred2 has been developed for predicting beta-turns in proteins using position-specific scoring matrices (PSSM) generated by PSI-BLAST and secondary structure information predicted by PSIPRED. However, the major limitation of Betatpred2 is that it predicts only beta-turn and non-beta-turn residues and does not provide any information on different beta-turn types. Thus, there is a need to predict beta-turn types using an approach based on multiple sequence alignment, which will be useful in overall tertiary structure prediction. RESULTS: In the present work, a method has been developed for the prediction of beta-turn types I, II, IV and VIII. For each turn type, two consecutive feed-forward back-propagation networks with a single hidden layer have been used where the first sequence-to-structure network has been trained on single sequences as well as on PSI-BLAST PSSM. The output from the first network along with PSIPRED predicted secondary structure has been used as input for the second-level structure-to-structure network. The networks have been trained and tested on a non-homologous dataset of 426 protein chains by 7-fold cross-validation. It has been observed that the prediction performance for each turn type is improved significantly by using multiple sequence alignment. The performance has been further improved by using a second level structure-to-structure network and PSIPRED predicted secondary structure information. It has been observed that Type I and II beta-turns have better prediction performance than Type IV and VIII beta-turns. The final network yields an overall accuracy of 74.5, 93.5, 67.9 and 96.5% with MCC values of 0.29, 0.29, 0.23 and 0.02 for Type I, II, IV and VIII beta-turns, respectively, and is better than random prediction.
AVAILABILITY: A web server for prediction of beta-turn types I, II, IV and VIII based on the above approach is available at http://www.imtech.res.in/raghava/betaturns/ and http://bioinformatics.uams.edu/mirror/betaturns/ (mirror site).

15.
16.
A recent paper in this journal has challenged the idea that complex adaptive features of proteins can be explained by known molecular, genetic, and evolutionary mechanisms. It is shown here that the conclusions of this prior work are an artifact of unwarranted biological assumptions, inappropriate mathematical modeling, and faulty logic. Numerous simple pathways exist by which adaptive multi-residue functions can evolve on time scales of a million years (or much less) in populations of only moderate size. Thus, the classical evolutionary trajectory of descent with modification is adequate to explain the diversification of protein functions.

17.

Background

In prior work, a phage engineered with a biofilm-degrading enzyme (dispersin B) cleared artificial, short-term biofilms more fully than the phage lacking the enzyme. An unresolved question is whether the transgene will be lost or maintained during phage growth – its loss would limit the utility of the engineering. Broadly supported evolutionary theory suggests that transgenes will be lost through a ‘tragedy of the commons’ mechanism unless the ecology of growth in biofilms meets specific requirements. We test that theory here.

Results

Functional properties of the transgenic phage were identified. Consistent with the previous study, the dispersin phage was superior to unmodified phage at clearing short term biofilms grown in broth, shown here to be an effect attributable to free enzyme. However, the dispersin phage was only marginally better than control phages on short term biofilms in minimal media and was no better than control phages in clearing long term biofilms. There was little empirical support for the tragedy of the commons framework despite a strong theoretical foundation for its supposed relevance. The framework requires that the transgene imposes an intrinsic cost, yet the transgene was intrinsically neutral or beneficial when expressed from one part of the phage genome. Expressed from a different part of the genome, the transgene did behave as if intrinsically costly, but its maintenance did not benefit from spatially structured growth per se – violating the tragedy framework.

Conclusions

Overall, the transgene was beneficial under many conditions, but no insight to its maintenance was attributable to the established evolutionary framework. The failure likely resides in system details that would be used to parameterize the models. Our study cautions against naive applications of evolutionary theory to synthetic biology, even qualitatively.

18.
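Determining the three-dimensional structure of proteins is of great biological significance and is the foundation of protein function research and rational drug design. At present, the most important method for solving protein structures is X-ray diffraction crystallography. The key to applying this technique, however, is obtaining high-quality protein crystals. Yet statistics show that only 42% of soluble, purified proteins yield crystals; that is, crystallizability differs from protein to protein. Because verifying protein crystallizability experimentally is time-consuming and laborious, researchers have used computer simulation to predict it, saving resources and cost and raising the experimental success rate. Drawing on our own research, this paper introduces several of the more successful current methods for predicting protein crystallizability and the research approaches behind them.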

19.
Methionine adenosyltransferase (MAT, EC 2.5.1.6)-mediated synthesis of S-adenosylmethionine (AdoMet) is a two-step process consisting of the formation of AdoMet and the subsequent cleavage of the tripolyphosphate (PPPi) molecule, a reaction induced, in turn, by AdoMet. The fact that the two activities, AdoMet synthesis and tripolyphosphate hydrolysis, can be measured separately is particularly useful when the site-directed mutagenesis approach is used to determine the functional role of the amino acid residues involved in each. The present report describes the cloning and subsequent functional refolding, using a bacterial expression system, of the MAT gene (GenBank accession number AF179714) from Leishmania donovani, the etiological agent of visceral leishmaniasis. The absolute need to include a sulfhydryl-protection reagent in the refolding buffer for this protein, in conjunction with the rapid inactivation of the functionally refolded protein by N-ethylmaleimide, suggests the presence of crucial cysteine residues in the primary structure of the MAT protein. The seven cysteines in L. donovani MAT were mutated to their isosteric amino acid, serine. The C22S, C44S, C92S and C305S mutants showed a drastic loss of AdoMet synthesis activity compared to the wild type, and the C33S and C47S mutants retained a mere 12% of wild-type MAT activity. C106S mutant activity and kinetics remained unchanged with respect to the wild type. Cysteine substitutions also modified PPPi cleavage and AdoMet induction. The C22S, C44S and C305S mutants lacked tripolyphosphatase activity altogether, whereas C33S, C47S and C92S retained low but detectable activity. The behavior of the C92S mutant was notable: its inability to synthesize AdoMet, combined with its retention of tripolyphosphatase activity, appears to indicate the specific involvement of the respective residue in the first step of the MAT reaction.

20.
To study the distinct influences of structure and function on evolution, we propose a minimalist model for proteins with binding pockets, called functional model proteins, based on a shifted-HP model on a two-dimensional square lattice. These model proteins are not maximally compact and contain an empty lattice site surrounded by at least three nearest neighbors, thus providing a binding pocket. Functional model proteins possess a unique native state, cooperative folding and tolerance to mutation. Due to the explicit functionality in these models (by design), we have been able to explore their fitness or evolutionary landscapes, as characterized by the size and distribution of homologous families and by the complexity of the inter-relatedness of the functional model proteins. Mindful that these minimalist models are highly idealized and two-dimensional, functional model proteins should nevertheless provide a useful means for exploring the constraints of maintaining structure and function on the evolution of proteins.
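
The binding-pocket criterion in the abstract above (an empty lattice site with at least three occupied nearest neighbors on a two-dimensional square lattice) can be sketched as follows. This covers the geometric test only, under the assumption that a conformation is given as a list of integer (x, y) coordinates; it does not model the shifted-HP energies, and the function name `binding_pocket_sites` is hypothetical.

```python
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # nearest neighbors on a square lattice


def binding_pocket_sites(coords, min_neighbors=3):
    """Return the empty lattice sites that touch at least min_neighbors residues.

    coords: one (x, y) tuple per residue, in chain order.
    """
    occupied = set(coords)
    if len(occupied) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        if abs(x0 - x1) + abs(y0 - y1) != 1:
            raise ValueError("consecutive residues must be lattice neighbors")

    # Only empty sites adjacent to the chain can qualify as pockets.
    candidates = {(x + dx, y + dy) for x, y in occupied for dx, dy in STEPS}
    candidates -= occupied

    pockets = []
    for x, y in sorted(candidates):
        contacts = sum((x + dx, y + dy) in occupied for dx, dy in STEPS)
        if contacts >= min_neighbors:
            pockets.append((x, y))
    return pockets
```

For a seven-residue chain folded into a C shape, `binding_pocket_sites([(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2)])` reports the enclosed site (1, 1), while a fully extended chain has no pocket.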
