Similar articles
 20 similar articles found (search time: 0 ms)
1.
A structure-based approach for prediction of MHC-binding peptides   Total citations: 5 (self-citations: 0; citations by others: 5)
Identification of immunodominant peptides is the first step in the rational design of peptide vaccines aimed at T-cell immunity. The advances in sequencing techniques and the accumulation of many protein sequences without the purified protein challenge the development of computer algorithms to identify dominant T-cell epitopes based on sequence data alone. Here, we focus on antigenic peptides recognized by cytotoxic T cells. The selection of T-cell epitopes along a protein sequence is influenced by the specificity of each of the processing stages that precede antigen presentation. The most selective of these processing stages is the binding of the peptides to the major histocompatibility complex molecules, and therefore many of the predictive algorithms focus on this stage. Most of these algorithms are based on known binding peptides whose sequences have been used for the characterization of binding motifs or profiles. Here, we describe a structure-based algorithm that does not rely on previous binding data. It is based on observations from crystal structures that many of the bound peptides adopt similar conformations and placements within the MHC groove. The algorithm uses a structural template of the peptide in the MHC groove upon which peptide candidates are threaded and their fit to the MHC groove is evaluated by statistical pairwise potentials. It can rank all possible peptides along a protein sequence or within a suspected group of peptides, directing the experimental efforts towards the most promising peptides. This approach is especially useful when no previous peptide binding data are available.
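The ranking step described in this abstract (thread every candidate peptide onto a fixed template, score it with pairwise potentials, sort) can be illustrated with a minimal Python sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the 9-residue peptide length, the contact list TEMPLATE_CONTACTS, the toy energy table TOY_POTENTIAL, and the function names thread_score and rank_peptides are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical template: (peptide position, contacting MHC residue type).
# Real contacts would be read from a crystal structure; these are placeholders.
TEMPLATE_CONTACTS: List[Tuple[int, str]] = [(1, "Y"), (1, "E"), (8, "W"), (8, "L")]

# Toy pairwise potential keyed by (peptide residue, MHC residue).
# Lower values mean a more favourable contact; the numbers are illustrative only.
TOY_POTENTIAL: Dict[Tuple[str, str], float] = {
    ("L", "Y"): -1.0,
    ("L", "E"): -0.5,
    ("V", "W"): -1.0,
    ("V", "L"): -1.5,
}
DEFAULT_ENERGY = 0.0  # assumed energy for residue pairs missing from the toy table


def thread_score(peptide: str) -> float:
    """Sum the pairwise energies of a peptide placed on the fixed template."""
    return sum(
        TOY_POTENTIAL.get((peptide[pos], mhc_residue), DEFAULT_ENERGY)
        for pos, mhc_residue in TEMPLATE_CONTACTS
    )


def rank_peptides(protein: str, length: int = 9) -> List[Tuple[float, int, str]]:
    """Score every window of the protein and return (score, start, peptide), best first."""
    windows = [
        (thread_score(protein[i:i + length]), i, protein[i:i + length])
        for i in range(len(protein) - length + 1)
    ]
    return sorted(windows)


if __name__ == "__main__":
    for score, start, peptide in rank_peptides("GGALAAAAAAVGG"):
        print(f"{start:3d}  {peptide}  {score:6.2f}")
```

The sketch needs no binding data for the allele, only a template and a potential table, which is the property the abstract emphasises. A real potential would cover all 20 x 20 residue pairs.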

2.
3.
An approach of encoding for prediction of splice sites using SVM   Total citations: 1 (self-citations: 0; citations by others: 1)
Huang J, Li T, Chen K, Wu J. Biochimie, 2006, 88(7): 923-929
In splice-site prediction, accuracy remains below 90% even though the sequences adjacent to splice sites are highly conserved. To improve prediction accuracy, much attention has been paid to improving the algorithms used, and little to the more fundamental issue of nucleotide encoding. In this paper, a predictor based on support vector machines (SVM) is constructed to distinguish true from false splice sites in higher eukaryotes. Four types of encoding were applied to generate the input for the SVM: mono-nucleotide (MN) encoding, MN with frequency difference between the true sites and false sites (FDTF) encoding, pair-wise nucleotide (PN) encoding, and PN with FDTF encoding. The results showed that PN with FDTF encoding as input to the SVM led to the most reliable recognition of splice sites: the accuracies for true donor sites and false sites were 96.3% and 93.7%, respectively, and the accuracies for true acceptor sites and false sites were 94.0% and 93.2%, respectively.
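To make the difference between the two base encodings concrete, here is a minimal Python sketch. It assumes that MN encoding is a 4-bit one-hot vector per nucleotide and that PN encoding is a 16-bit one-hot vector per adjacent nucleotide pair. The abstract does not give the exact layout, so both are assumptions, and the FDTF weighting is not reproduced. The function names mn_encode and pn_encode are hypothetical.

```python
from itertools import product
from typing import List

NUCLEOTIDES = "ACGT"
# All 16 ordered dinucleotides: AA, AC, AG, AT, CA, ...
PAIRS = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]


def mn_encode(seq: str) -> List[int]:
    """Assumed mono-nucleotide encoding: 4 one-hot bits per position."""
    vec: List[int] = []
    for base in seq.upper():
        vec.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return vec


def pn_encode(seq: str) -> List[int]:
    """Assumed pair-wise encoding: 16 one-hot bits per adjacent nucleotide pair."""
    seq = seq.upper()
    vec: List[int] = []
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        vec.extend(1 if pair == p else 0 for p in PAIRS)
    return vec


if __name__ == "__main__":
    window = "AGGT"  # toy window, not a sequence from the paper
    print(len(mn_encode(window)), len(pn_encode(window)))
```

Under these assumptions a window of n nucleotides gives 4n features with MN and 16(n - 1) with PN. The PN vector therefore carries neighbour dependencies that a per-position encoding cannot express.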

4.
A computer implementation of a peptide drug-design strategy has been developed. The system is named EmPLiCS (Empirical Peptide Ligand Construction System) after its strategy, which searches for peptide-ligand structures by referring to empirical rules derived from known protein 3D structures. The system was tested on several known peptide-protein complexes. The results demonstrated the ability of this system to detect key residues of peptides that are crucial for interaction with their specific proteins. The system also showed the ability to detect the main chain trace of these peptides. Some of the main chain atoms were detected even though the complete primary structures were not reproduced, suggesting that main chain structure is important in peptide-protein recognition. The results of the present study demonstrated that the empirical rules-based system can generate significant information for use in the design of natural peptide drugs.

5.
A classification model of a DNA-binding protein chain was created based on identification of alpha helices within the chain likely to bind to DNA. Using the model, all chains in the Protein Data Bank were classified. For many of the chains classified with high confidence, previous documentation for DNA-binding was found, yet no sequence homology to the structures used to train the model was detected. The result indicates that the chain model can be used to supplement sequence based methods for annotating the function of DNA-binding. Four new candidates for DNA-binding were found, including two structures solved through structural genomics efforts. For each of the candidate structures, possible sites of DNA-binding are indicated by listing the residue ranges of alpha helices likely to interact with DNA.

6.
A classification model of a DNA-binding protein chain was created based on identification of alpha helices within the chain likely to bind to DNA. Using the model, all chains in the Protein Data Bank were classified. For many of the chains classified with high confidence, previous documentation for DNA-binding was found, yet no sequence homology to the structures used to train the model was detected. The result indicates that the chain model can be used to supplement sequence based methods for annotating the function of DNA-binding. Four new candidates for DNA-binding were found, including two structures solved through structural genomics efforts. For each of the candidate structures, possible sites of DNA-binding are indicated by listing the residue ranges of alpha helices likely to interact with DNA.

7.
8.
The synthesis and properties of 4-azido-2-nitrophenyl α-D-mannopyranoside, N-diazoacetyl-β-D-glucopyranosylamine, and 4-diazoacetamidophenyl α-D-mannopyranoside are described. These compounds are potential reagents for photo-affinity labelling of carbohydrate-binding sites in proteins.

9.
An empirical method for the prediction of T-cell epitopes   Total citations: 5 (self-citations: 1; citations by others: 5)
Identification of T-cell epitopes from foreign proteins is the current focus of much research. Methods using simple two or three position motifs have proved useful in epitope prediction for major histocompatibility complex (MHC) class I, but to date not for MHC class II molecules. We utilized data from pool sequence analysis of peptides eluted from two HLA-DR13 alleles to construct a computer algorithm for predicting the probability that a given sequence will be naturally processed and presented on these alleles. We assessed the ability of this method to predict known self-peptides from these DR13 alleles, DRB1 *1301 and *1302, as well as an immunodominant T-cell epitope. We also compared the predictions of this scoring procedure with the measured binding affinities of a panel of overlapping peptides from hepatitis B virus surface antigen. We concluded that this method may have wide application for the prediction of T-cell epitopes for both MHC class I and class II molecules.

10.
Highly specific prediction of phosphorylation sites in proteins   Total citations: 1 (self-citations: 0; citations by others: 1)
SUMMARY: The prediction of significant short functional protein sequences has inherent problems. In predicting phosphorylation sites, the problems arise from the shortness of phosphorylation sites, the difficulty of maintaining many different predefined models of binding sites, and the difficulty of obtaining highly sensitive predictions with constant sensitivity and specificity. The algorithm presented in this paper overcomes these problems. The proposed algorithm, PHOSITE, is based on case-based sequence analysis. This enables the prediction of phosphorylation sites with constant specificity and sensitivity. Furthermore, the method not only predicts phosphorylation sites in general but also predicts the most probable type of kinase involved. AVAILABILITY: The tool PHOSITE implementing the presented method can be evaluated at http://www.phosite.com.

11.
Structure-based prediction of DNA target sites by regulatory proteins   Total citations: 15 (self-citations: 0; citations by others: 15)
Kono H, Sarai A. Proteins, 1999, 35(1): 114-131
Regulatory proteins play a critical role in controlling complex spatial and temporal patterns of gene expression in higher organisms, by recognizing multiple DNA sequences and regulating multiple target genes. Increasing amounts of structural data on protein-DNA complexes provide clues to the mechanism of target recognition by regulatory proteins. The analyses of the propensities of base-amino acid interactions observed in those structural data show that there is no one-to-one correspondence in the interaction, but clear preferences exist. On the other hand, the analysis of spatial distribution of amino acids around bases shows that even those amino acids with strong base preference such as Arg with G are distributed in a wide space around bases. Thus, amino acids with many different geometries can form a similar type of interaction with bases. The redundancy and structural flexibility in the interaction suggest that there are no simple rules in the sequence recognition, and its prediction is not straightforward. However, the spatial distributions of amino acids around bases indicate a possibility that the structural data can be used to derive empirical interaction potentials between amino acids and bases. Such information extracted from structural databases has been successfully used to predict amino acid sequences that fold into particular protein structures. We surmised that the structures of protein-DNA complexes could be used to predict DNA target sites for regulatory proteins, because determining DNA sequences that bind to a particular protein structure should be similar to finding amino acid sequences that fold into a particular structure. Here we demonstrate that the structural data can be used to predict DNA target sequences for regulatory proteins. Pairwise potentials that determine the interaction between bases and amino acids were empirically derived from the structural data.
These potentials were then used to examine the compatibility between DNA sequences and the protein-DNA complex structure in a combinatorial "threading" procedure. We applied this strategy to the structures of protein-DNA complexes to predict DNA binding sites recognized by regulatory proteins. To test the applicability of this method in target-site prediction, we examined the effects of cognate and noncognate binding, cooperative binding, and DNA deformation on the binding specificity, and predicted binding sites in real promoters and compared with experimental data. These results show that target binding sites for several regulatory proteins are successfully predicted, and our data suggest that this method can serve as a powerful tool for predicting multiple target sites and target genes for regulatory proteins.

12.
13.
An approach is presented for the stable covalent immobilization of proteins with a high retention of biological activity. First, chemical modification studies were used to establish enzyme structural and functional properties relevant to the covalent immobilization of an enzyme to agarose based supports. Heparinase was used as a model enzyme in this set of studies. Amine modifications result in 75-100% activity loss, but the effect is moderated by a reduction in the degree of derivatization. N-hydroxysuccinimide, 1,1,1-trifluoroethanesulfonic acid, and epoxide activated agarose were utilized to determine the effect of amine reactive supports on immobilized enzyme activity retention. Cysteine modifications resulted in 25-50% loss in activity, but free cysteines were inaccessible to either immobilized bromoacetyl or p-chloromercuribenzoyl groups. Amine reactive coupling chemistries were therefore utilized for the covalent immobilization of heparinase. Second, to ensure maximal stability of the immobile protein-support linkage, the identification and subsequent elimination of the principal sources of protein detachment were systematically investigated. By using high-performance liquid chromatography (HPLC), electrophoresis, and radiolabeling techniques, the relative contributions of four potential detachment mechanisms (support degradation, proteolytic degradation, desorption of noncovalently bound protein, and bond solvolysis) were quantified. The mechanisms of lysozyme, bovine serum albumin, and heparinase leakage from N-hydroxysuccinimide or 1,1,1-trifluoroethanesulfonic acid activated agarose were elucidated. By use of stringent postimmobilization support wash procedures, noncovalently bound protein loss. An effective postimmobilization washing procedure is presented for the removal of adsorbed protein and the complete elimination of immobilized protein loss.

14.
Turn prediction in proteins using a pattern-matching approach   Total citations: 16 (self-citations: 0; citations by others: 16)
We extend the use of amino acid sequence patterns [Cohen, F.E., Abarbanel, R. M., Kuntz, I. D., & Fletterick, R. J. (1983) Biochemistry 22, 4894-4904] to the identification of turns in globular proteins. The approach uses a conservative strategy, combined with a hierarchical search (strongest patterns first) and length-dependent masking, to achieve high accuracy (95%) on a test set of proteins of known structure. Applying the same procedure to homologous families gives a 90% success rate. Straightforward changes are suggested to improve the predictive power. The computer program, written in Lisp, provides a general pattern-recognition language well suited for a number of investigations of protein and nucleic acid sequences.

15.
16.
Nuclear proteins were extracted in 2 M NaCl from membrane-depleted nuclei isolated from HL60 cells. Extracted proteins were submitted to affinity chromatography columns containing immobilized glucose, galactose or lactose. The polypeptides present in the different eluted fractions were resolved by SDS-PAGE and were either silver stained or analysed by immunoblotting with monoclonal or polyclonal antibodies, respectively, raised against the glucose-binding protein CBP67 and the galactose-binding proteins CBP35 and L14. The results presented here show that HL60 cell nuclei contain CBP35 and a glucose-binding lectin of 70 kDa (CBP70). These data account for the previously reported binding of neoglycoproteins containing glucosyl and galactosyl residues to HL60 cell nuclei. Furthermore, the present study provides the new information that CBP35 can associate with CBP70 by interactions dependent on the binding of CBP35 to lactose, and the results of some affinity chromatography experiments strongly suggest that CBP35 and CBP70 associate by protein-protein interactions. The potential function of this lactose-mediated interaction is discussed with respect to data recently reported by others showing that CBP35 is involved in in vitro mRNA splicing and that lactose inhibits the processing of the pre-RNA substrate.
Keywords: HL60; lectins; nucleus; protein-protein interactions

17.
We develop an integrated probabilistic model to combine protein physical interactions, genetic interactions, highly correlated gene expression networks, protein complex data, and domain structures of individual proteins to predict protein functions. The model is an extension of our previous model for protein function prediction based on Markovian random field theory. The model is flexible in that other protein pairwise relationship information and features of individual proteins can be easily incorporated. Two features distinguish the integrated approach from other available methods for protein function prediction. One is that the integrated approach uses all available sources of information with different weights for different sources of data. It is a global approach that takes the whole network into consideration. The second feature is that the posterior probability that a protein has the function of interest is assigned. The posterior probability indicates how confident we are about assigning the function to the protein. We apply our integrated approach to predict functions of yeast proteins based upon MIPS protein function classifications and upon the interaction networks based on MIPS physical and genetic interactions, gene expression profiles, tandem affinity purification (TAP) protein complex data, and protein domain information. We study the recall and precision of the integrated approach using different sources of information by the leave-one-out approach. In contrast to using MIPS physical interactions only, the integrated approach combining all of the information increases the recall from 57% to 87% when the precision is set at 57%, an increase of 30 percentage points.
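The closing figure in this abstract is an absolute difference in recall at fixed precision. The short Python sketch below restates the standard definitions of precision and recall and separates the absolute gain (in percentage points) from the relative gain. The function names and the example counts are hypothetical; only the 57% and 87% recall values come from the abstract.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Standard definitions: precision = TP/(TP+FP), recall = TP/(TP+FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def absolute_gain_points(before: float, after: float) -> float:
    """Difference between two rates, expressed in percentage points."""
    return round((after - before) * 100, 1)


def relative_gain_percent(before: float, after: float) -> float:
    """Difference between two rates, relative to the starting rate, in percent."""
    return round((after - before) / before * 100, 1)


if __name__ == "__main__":
    # Recall values reported in the abstract at a fixed precision of 57%.
    print(absolute_gain_points(0.57, 0.87))   # absolute change in recall
    print(relative_gain_percent(0.57, 0.87))  # change relative to the baseline recall
```

The reported "30%" corresponds to the first quantity. Relative to the 57% baseline, the same change is a larger proportional improvement.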

18.
An examination of the binding sites of four carbohydrate binding proteins (Escherichia coli lactose repressor, E. coli arabinose-binding protein, yeast hexokinase A and Concanavalin A) revealed certain similarities of amino acid sequences and residues forming hydrogen bonds and hydrophobic interactions with the bound carbohydrate. These were: (i) Asx-Asx, hydrogen bonding to the pyranose ring oxygen and anomeric-OH group; (ii) Arg-X-X-X-(Ser/Thr), or the reverse sequence, with the Arg hydrogen bonding to the pyranose ring oxygen; (iii) Lys-(Ser/Thr)-X-X-Asp, or the reverse sequence and with interchange of the Lys-(Ser/Thr) positions, with hydrogen bonding of either or both the Lys and Asp residues to the -OH groups at carbons 2, 3, 4 or 6; (iv) a diaromatic sequence with possible hydrophobic interactions to the faces of the pyranose ring structure. An algorithm was devised to search the amino acid sequences of a large number of proteins, those known to bind carbohydrates as well as those without known carbohydrate-binding activities, for the four amino acid sequence criteria. The algorithm incorporated a weighted distance value (WDV) to assess the approximate distance between any two criteria, with the WDV being based on the predicted secondary structure of the protein amino acid sequence. When the algorithm using criteria 1 and 2 plus the WDV was applied to the sequences of 125 proteins, the method indicated the presence of the potential carbohydrate-binding site motif for 42% of proteins with known carbohydrate binding, only 8% of proteins were predicted as false positives, and the accuracy of the method was calculated to be 61.6%.(ABSTRACT TRUNCATED AT 250 WORDS)
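The four sequence criteria listed in this abstract translate naturally into regular expressions over one-letter amino acid codes. The Python sketch below is one possible reading and is not the authors' algorithm. It assumes Asx means Asp or Asn (D or N), that "diaromatic" means two adjacent residues from F, W and Y, and that X is any residue. The weighted distance value (WDV) step is not reproduced because the abstract gives no formula for it. The names MOTIFS and find_motifs are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# One-letter-code patterns for criteria (i) to (iv); see the stated assumptions above.
MOTIFS: Dict[str, str] = {
    "i_asx_asx": r"[DN][DN]",
    # Arg-X-X-X-(Ser/Thr) or the reverse sequence
    "ii_arg_x3_ser_thr": r"R.{3}[ST]|[ST].{3}R",
    # Lys-(Ser/Thr)-X-X-Asp, Lys/(Ser/Thr) interchanged, and both reversed
    "iii_lys_ser_thr_x2_asp": r"K[ST]..D|[ST]K..D|D..[ST]K|D..K[ST]",
    "iv_diaromatic": r"[FWY][FWY]",
}


def find_motifs(sequence: str) -> Dict[str, List[Tuple[int, str]]]:
    """Return, per criterion, every (0-based start, matched text), overlaps included."""
    seq = sequence.upper()
    hits: Dict[str, List[Tuple[int, str]]] = {}
    for name, pattern in MOTIFS.items():
        # A lookahead with a capture group reports overlapping matches as well.
        hits[name] = [
            (m.start(), m.group(1)) for m in re.finditer(f"(?=({pattern}))", seq)
        ]
    return hits


if __name__ == "__main__":
    toy_sequence = "MDNARGGGSKTAADWF"  # made-up sequence, not from the paper
    for criterion, found in find_motifs(toy_sequence).items():
        print(criterion, found)
```

Matching alone would flag many proteins, since short patterns like these occur by chance. That is presumably why the described method adds a distance constraint between criteria.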

19.
Apurinic/apyrimidinic (AP) sites, a prominent type of DNA damage, are repaired through the base excision repair mechanism in both prokaryotes and eukaryotes and may interfere with many other cellular processes. A full repertoire of AP site-binding proteins in cells is presently unknown, preventing reliable assessment of harm inflicted by these ubiquitous lesions and of their involvement in the flux of DNA metabolism. We present a proteomics-based strategy for assembling at least a partial catalogue of proteins capable of binding AP sites in DNA. The general scheme relies on the sensitivity of many AP site-bound protein species to NaBH4 cross-linking. An affinity-tagged substrate is used to facilitate isolation of the cross-linked species, which are then separated and analyzed by mass spectrometry methods. We report identification of seven proteins from Escherichia coli (AroF, DnaK, MutM, PolA, TnaA, TufA, and UvrA) and two proteins from bakers' yeast (ARC1 and Ygl245wp) reactive for AP sites in this system.

20.
Protein folding and protein binding are similar processes. In both, structural units combinatorially associate with each other. In the case of folding, we mostly handle relatively small units, building blocks or domains, that are covalently linked. In the case of multi-molecular binding, the subunits are relatively large and are associated only by non-covalent bonds. Experimentally, the difficulty in the determination of the structures of such large assemblies increases with the complex size and the number of components it contains. Computationally, the prediction of the structures of multi-molecular complexes has largely not been addressed, probably owing to the magnitude of the combinatorial complexity of the problem. Current docking algorithms mostly target prediction of pairwise interactions. Here our goal is to predict the structures of multi-unit associations, whether these are chain-connected as in protein folding, or separate disjoint molecules in the assemblies. We assume that the structures of the single units are known, either through experimental determination or modeling. Our aim is to combinatorially assemble these units to predict their structure. To address this problem we have developed CombDock. CombDock is a combinatorial docking algorithm for the structural units assembly problem. Below, we briefly describe the algorithm and present examples of its various applications to folding and to multi-molecular assemblies. To test the robustness of the algorithm, we use inaccurate models of the structural units, derived either from crystal structures of unbound molecules or from modeling of the target sequences. The algorithm has been able to predict near-native arrangements of the input structural units in almost all of the cases, suggesting that a combinatorial approach can overcome the imperfect shape complementarity caused by the inaccuracy of the models. 
In addition, we further show that through a combinatorial docking strategy it is possible to enhance the predictions of pairwise interactions involved in a multi-molecular assembly.
