Similar articles
1.
The antigenic index: a novel algorithm for predicting antigenic determinants   Total citations: 39; self-citations: 0; citations by others: 39
In this paper, we introduce a computer algorithm which can be used to predict the topological features of a protein directly from its primary amino acid sequence. The computer program generates values for surface accessibility parameters and combines these values with those obtained for regional backbone flexibility and predicted secondary structure. The output of this algorithm, the antigenic index, is used to create a linear surface contour profile of the protein. Because most, if not all, antigenic sites are located within surface exposed regions of a protein, the program offers a reliable means of predicting potential antigenic determinants. We have tested the ability of this program to generate accurate surface contour profiles and predict antigenic sites from the linear amino acid sequences of well-characterized proteins and found a strong correlation between the predictions of the antigenic index and known structural and biological data. Received on August 17, 1987; accepted on December 31, 1987

2.
A flexible package designed to study protein structure is described. The package is devoted to the analysis of protein sequences by drawing structural profiles of specific structure-related amino acid parameters. An Aminoacidic Parameters Data Bank (CHAMP) containing 32 different series of physico-chemical parameters of amino acids is available. Sequences can be loaded from any ASCII format data bank or from the keyboard. The program possesses a routine which enables easy updating of the protein data bank and CHAMP Data Bank. FAST reads statistical correlations between two plots in order to identify structural similarities. Plots can be printed, saved or used for correlation, comparison or graph overlap by using common spreadsheets (e.g. Lotus 123). Plots can be smoothed by a running mean or a running median. The program also has a special feature: a global flexibility analysis of proteins. The package runs on IBM or compatibles and requires DOS 3.0 or later. Received on June 20, 1989; accepted on August 2, 1989
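The smoothing step mentioned in this abstract (running mean or running median over a plotted profile) can be sketched as follows. This is a minimal illustration, not the package's own code: the function name `smooth`, the window size, and the sample values are all assumptions made for the example.

```python
from statistics import mean, median


def smooth(profile, window=3, method="mean"):
    """Return the running mean or running median of a numeric profile.

    Only full windows are used, so the result has
    len(profile) - window + 1 values.
    """
    if not 1 <= window <= len(profile):
        raise ValueError("window must be between 1 and len(profile)")
    if method not in ("mean", "median"):
        raise ValueError("method must be 'mean' or 'median'")
    reducer = mean if method == "mean" else median
    return [reducer(profile[i:i + window])
            for i in range(len(profile) - window + 1)]


# Made-up per-residue values with a single spike at the third position.
profile = [1, 2, 9, 4, 5]
print(smooth(profile, 3, "mean"))
print(smooth(profile, 3, "median"))
```

The running median suppresses the isolated spike more strongly than the running mean, which is the usual reason for offering both.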

3.
MOTIVATION: It is known that the physico-chemical characteristics of proteins underlying specific folding of the polypeptide chain and the protein function are evolutionarily conserved. Detection of such characteristics while analyzing homologous sequences would substantially expand the knowledge of protein function, structure, and evolution. These characteristics are maintained constant, in particular, by co-ordinated substitutions. In this process, the destabilizing effect of a substitution may be compensated by another substitution at a different position within the same protein, making the overall change in this protein characteristic insignificant. Consequently, the patterns of co-ordinated substitutions contain important information on conserved physico-chemical properties of proteins; they therefore merit investigation, along with the development of corresponding methods and software for correlation analysis of protein sequences available to a wide range of users. RESULTS: A software package for analyzing correlated amino acid substitutions at different positions within aligned protein sequences was developed. The approach involves searching for evolutionarily conserved physico-chemical characteristics of proteins based on the information on the pairwise correlations of amino acid substitutions at different protein positions. The software was applied to analyze DNA-binding domains of the homeodomain class. As a result, two conserved physico-chemical characteristics were found to be preserved by co-ordinated substitutions at certain groups of positions in the protein sequence. Possible functional roles of these characteristics are discussed. AVAILABILITY: The program package is available at http://wwwmgs.bionet.nsc.ru/programs/CRASP/.

4.
The three-dimensional structure of a protein molecule appears to depend on the amino acid sequence of the protein in an as yet incompletely described manner. If the amino acid sequence is replaced by a numerical sequence of values representing a physical or chemical property of amino acids, the resulting numerical sequence is amenable to autocorrelation analysis. Further, if certain geometrical parameters are calculated from the three-dimensional structure of a protein to form a configurational series, pairs of property series and configurational series can be analyzed by cross-correlation techniques. The data base for the analysis was the three-dimensional structures of ten proteins as determined by X-ray crystallography. Such analysis yields the result that the hydrophobicity of an amino acid residue in a protein influences the orientation angle of the amino acid side chain. This result is consistent with the widely current “oil-drop” model of protein structure. Hydrophobicity also appears to influence the backbone dihedral angle φ, but not ψ. Such a directional effect cannot be explained by a current model of information transfer in protein helices. The magnitude of the cross correlations does not appear to be satisfactory for construction of a transfer function model for the prediction of general features of protein structure from amino acid sequences.
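The two analyses named in this abstract can be sketched as a lagged Pearson correlation: autocorrelation compares a property series with a shifted copy of itself, and cross-correlation compares a property series with a configurational series. The abstract does not give the exact estimator used, so the Pearson form, the function names, and the toy series below are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)  # undefined for constant series


def cross_correlation(prop, config, lag):
    """Correlate prop[i] with config[i + lag] for a non-negative lag."""
    if len(prop) != len(config):
        raise ValueError("series must have the same length")
    if not 0 <= lag <= len(prop) - 2:
        raise ValueError("lag leaves fewer than two overlapping points")
    return pearson(prop[:len(prop) - lag], config[lag:])


def autocorrelation(series, lag):
    """Cross-correlation of a series with itself."""
    return cross_correlation(series, series, lag)


# Toy property series that alternates, as a hydrophobic/hydrophilic
# pattern along a strand might; the values are made up.
series = [1, -1, 1, -1, 1, -1]
print(autocorrelation(series, 1), autocorrelation(series, 2))
```

A strongly negative value at lag 1 and a positive value at lag 2 is the signature of a period-2 pattern in the series.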

5.
6.
The presentation by antigen-presenting cells of immunodominant peptide segments in association with major histocompatibility complex (MHC) encoded proteins is fundamental to the efficacy of a specific immune response. One approach used to identify immunodominant segments within proteins has involved the development of predictive algorithms which utilize amino acid sequence data to identify structural characteristics or motifs associated with in vivo antigenicity. The parallel-computing technique termed ‘neural networking’ has recently been shown to be remarkably efficient at addressing the problem of pattern recognition and can be applied to predict protein secondary structure attributes directly from amino acid sequence data. In order to examine the potential of a neural network to generalize peptide structural features related to binding within class II MHC-encoded proteins, we have trained a neural network to determine whether or not any given amino acid of a protein is part of a peptide segment capable of binding to HLA-DR1. We report that a neural network trained on a data base consisting of peptide segments known to bind to HLA-DR1 is able to generalize features relating to HLA-DR1-binding capacity (r = 0.17 and p = 0.0001).

7.
We have implemented several algorithms, developed by various authors for predicting structural features of proteins from their primary structure, on an Apple IIe and collected them in a suite, named PROTEUS. This suite incorporates: (i) methods for predicting secondary structure; (ii) the algorithm for computing the hydropathy profile using one out of five available sets of parameters; (iii) the algorithms for calculating the hydrophobic moment plot; and (iv) for performing the amphipathic analysis using one out of four available sets of parameters. The suite has a utility program for storing on a disk the sequence to be analysed. As an example, we applied some of the methods included in PROTEUS to predict the structure of a mitochondrial leader peptide. The results suggest the occurrence of structural features possibly related to the import of proteins into mitochondria. Received on April 30, 1987; accepted on July 21, 1987
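As an illustration of item (iii), the hydrophobic moment of a segment can be computed as the length of the vector sum of per-residue hydrophobicities placed at a fixed angular step around the helix axis. The abstract does not say which formulation PROTEUS uses; the sketch below assumes the commonly cited definition with a 100° step per residue for an α-helix, and the function name and inputs are hypothetical.

```python
from math import cos, hypot, radians, sin


def hydrophobic_moment(hydrophobicities, angle_deg=100.0):
    """Length of the vector sum of hydrophobicities around the axis.

    hydrophobicities: per-residue values from any hydrophobicity scale.
    angle_deg: angular step per residue (assumed 100 for an alpha-helix).
    """
    sin_sum = sum(h * sin(radians(angle_deg * n))
                  for n, h in enumerate(hydrophobicities))
    cos_sum = sum(h * cos(radians(angle_deg * n))
                  for n, h in enumerate(hydrophobicities))
    return hypot(sin_sum, cos_sum)


# Uniform hydrophobicity over 18 residues (five full turns) has no
# preferred face, so the moment is close to zero.
print(hydrophobic_moment([1.0] * 18))
```

A moment plot is then obtained by sliding a fixed-length window along the sequence and evaluating this function for each window.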

8.
9.
A new algorithm to predict the types of membrane proteins is proposed. Besides the amino acid composition of the query protein, the information within the amino acid sequence is taken into account. A formulation of the autocorrelation functions based on the hydrophobicity index of the 20 amino acids is adopted. The overall predictive accuracy is remarkably increased for the database of 2054 membrane proteins studied here. An improvement of about 13% in the resubstitution test and 8% in the jackknife test is achieved compared with those of algorithms based merely on the amino acid composition. Consequently, overall predictive accuracy is as high as 94% and 82% for the resubstitution and jackknife tests, respectively, for the prediction of the five types. Since the proposed algorithm is based on more parameters than those in the amino acid composition approach, the predictive accuracy would be further increased for a larger and more class-balanced database. The present algorithm should be useful in the determination of the types and functions of new membrane proteins. The computer program is available on request.

10.
Using data from the Protein Data Bank, the correlations of primary and secondary structures of proteins were analyzed. The correlation values of the amino acids and the eight secondary structure types were calculated, where the position of the amino acid and the position in sequence with the particular secondary structure differ by at most 25. The diagrams describing these results indicate that correlations are significant at distances between −9 and 10. The results show that the substituents on Cβ or Cγ atoms of an amino acid play a major role in its preference for a particular secondary structure at the same position in the sequence, while the polarity of an amino acid has a significant influence on α-helices and strands at some distance in the sequence. The diagrams corresponding to polar amino acids are noticeably asymmetric. The diagrams point out the exchangeability of residues in the proteins; the amino acids with similar diagrams have similar local folding requirements. Electronic supplementary material  The online version of this article (doi:) contains supplementary material, which is available to authorized users.

11.
Current methods of prediction of protein conformation are reviewed and the algorithms on which they rely are presented. For non-homologous proteins and after cross-validation the reported methods exhibit a probability index, i.e. the per cent of correctly predicted residues per predicted residues, of 63–65% with a standard deviation of the order of 7% for three conformational states: helix, β-strand and coil. This present limitation in the accuracy of predictions that use only the information of the local sequence can be related essentially to the effect of long-range interactions specific for each protein family. The methods based on sequence similarity can improve the accuracy of prediction by expressing explicitly the homology of the protein to be predicted with proteins in the database. In these circumstances the probability index can reach 87% with a standard deviation of 6.6%. This property can be used for modeling homologous proteins by aiding in amino acid sequence alignments. The prediction of the tertiary structure of a protein is still limited to the case of modeling a structure based on the known three-dimensional structure of a homologous protein.
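The probability index defined in this abstract (per cent of correctly predicted residues per predicted residues) can be sketched for a three-state prediction as below. The one-letter state codes (H, E, C) and the use of "-" to mark residues for which no prediction was made are assumptions of the example, not conventions taken from the review.

```python
def probability_index(predicted, observed, unpredicted="-"):
    """Per cent of predicted residues whose state matches the observed one.

    predicted, observed: equal-length strings of state codes, e.g.
    H (helix), E (beta-strand), C (coil). Residues marked with the
    `unpredicted` symbol are left out of the denominator.
    """
    if len(predicted) != len(observed):
        raise ValueError("strings must have the same length")
    scored = [(p, o) for p, o in zip(predicted, observed) if p != unpredicted]
    if not scored:
        raise ValueError("no residue was predicted")
    correct = sum(p == o for p, o in scored)
    return 100.0 * correct / len(scored)


print(probability_index("HHEEC", "HHECC"))  # 4 of 5 residues correct
print(probability_index("HH-EC", "HHECC"))  # 3 of 4 predicted residues correct
```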

12.
Concepts of the uniqueness of the amino acid sequences of proteins were defined in a prior report (Saroff, H. A. and F. A. Kutyna. 1981. “The Uniqueness of Protein Sequences: A Monte Carlo Analysis.” Bull. math. Biol. 43, 619–639), which presented a detailed discussion of i-uniqueness, i.e. the tendency of small peptides to be repeated within an amino acid sequence of a protein. We now report on the quantitative analysis of o-uniqueness, which evaluates the tendency of small peptides to be repeated amongst different proteins, usually of a single species. A detailed analysis of the o-uniqueness of several proteins is presented to illustrate the method and the range of values encountered. Uniqueness data on sequences of human proteins in a data bank of sequences containing about 32,500 amino acids are made available in the form of a microfiche. Analysis of biologically active subsequences such as the angiotensins and the enkephalins suggests a tendency of the subsequences contributing to the property of o-uniqueness to cluster in portions of the parent protein sequence which are biologically active. This property may provide a general method for predicting biologically active areas of proteins. Current data may already be adequate to permit useful predictions, and the rapidly accumulating and interrelated new data on nucleic acid and protein sequences will further enhance the power of o-uniqueness analysis.

13.
We present a new method, secondary structure prediction by deviation parameter (SSPDP), for predicting the secondary structure of proteins from amino acid sequence. Deviation parameters (DP) for amino acid singlets, doublets and triplets were computed with respect to secondary structural elements of proteins based on the dictionary of secondary structure prediction (DSSP)-generated secondary structure for 408 selected nonhomologous proteins. Amino acid triplets which are not found in the selected dataset are assigned a DP value of zero with respect to the secondary structural elements of proteins. The total number of parameters generated is 15,432, out of a possible 25,260. The deviation parameter set is complete with respect to amino acid singlets and doublets, and partially complete with respect to amino acid triplets. These generated parameters were used to predict secondary structural elements from amino acid sequence. The secondary structure predicted by our method (SSPDP) was compared with that of single sequence (NNPREDICT) and multiple sequence (PHD) methods. The average prediction accuracy for α-helix by the SSPDP, NNPREDICT and PHD methods was found to be 57%, 44% and 69% respectively for the proteins in the selected dataset. For β-strand the prediction accuracy was found to be 69%, 21% and 53% respectively by the SSPDP, NNPREDICT and PHD methods. This clearly indicates that the secondary structure prediction by our method is as good as the PHD method and much better than the NNPREDICT method.

14.
The amino acid composition of human alcohol dehydrogenase (ADH) was compared with alcohol dehydrogenases from different organisms and with other proteins. Similar amino acid sequences in human ADH (template protein) and in other proteins were determined by means of an original computer program. Analysis of amino acid motifs reveals that ADHs from evolutionarily closer organisms share more amino acid sequences. The quantitative measure of amino acid similarity was the number of similar motifs in the analyzed protein per protein length. This value was measured for ADHs and for different proteins. For ADHs, this quotient was higher than for proteins with different functions; for vertebrates it correlated with evolutionary closeness. A similar motif comparison was made with the help of the “MEME” program suite. The analysis of ADHs revealed 4 motifs common to 6 of 10 tested organisms and no such motifs for proteins of different function. The conclusion is that overall amino acid composition is more important for protein function than amino acid order, and for enzymes of similar function it correlates better with evolutionary distance between organisms.

15.
We present an empirical method for identification of distinct structural motifs in proteins on the basis of experimentally determined backbone and 13Cβ chemical shifts. Elements identified include the N-terminal and C-terminal helix capping motifs and five types of β-turns: I, II, I′, II′ and VIII. Using a database of proteins of known structure, the NMR chemical shifts, together with the PDB-extracted amino acid preference of the helix capping and β-turn motifs are used as input data for training an artificial neural network algorithm, which outputs the statistical probability of finding each motif at any given position in the protein. The trained neural networks, contained in the MICS (motif identification from chemical shifts) program, also provide a confidence level for each of their predictions, and values ranging from ca 0.7–0.9 for the Matthews correlation coefficient of its predictions far exceed those attainable by sequence analysis. MICS is anticipated to be useful both in the conventional NMR structure determination process and for enhancing on-going efforts to determine protein structures solely on the basis of chemical shift information, where it can aid in identifying protein database fragments suitable for use in building such structures.
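The Matthews correlation coefficient quoted in this abstract is computed from the four cells of a binary confusion matrix. A minimal sketch follows; the function name, the argument order, and the choice to return 0.0 when the denominator vanishes are conventions assumed for the example, and the counts are invented rather than taken from MICS.

```python
from math import sqrt


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp, tn, fp, fn: true positives, true negatives, false positives,
    false negatives. Returns 0.0 if any marginal total is zero.
    """
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Invented counts for one motif type: 6 hits, 3 correct rejections,
# 1 false alarm, 2 misses.
print(round(matthews_cc(6, 3, 1, 2), 3))
```

The coefficient ranges from -1 (every call wrong) through 0 (no better than chance) to +1 (every call right), which is why it is preferred over raw accuracy when motifs are rare.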

16.
Reversed-phase liquid chromatography (LC) directly coupled with electrospray-tandem mass spectrometry (MS/MS) is a successful choice to obtain a large number of product ion spectra from a complex peptide mixture. We describe a search validation program, ScoreRidge, developed for analysis of LC-MS/MS data. The program validates peptide assignments to product ion spectra resulting from usual probability-based searches against primary structure databases. The validation is based only on correlation between the measured LC elution time of each peptide and the deduced elution time from the amino acid sequence assigned to product ion spectra obtained from the MS/MS analysis of the peptide. Sufficient numbers of probable assignments gave a highly correlative curve. Any peptide assignments within a certain tolerance from the correlation curve were accepted for the following arrangement step to list identified proteins. Using this data validation program, host protein candidates responsible for interaction with human hepatitis B virus core protein were identified from a partially purified protein mixture. The present simple and practical program complements protein identification from usual product ion search algorithms and reduces manual interpretation of the search result data. It will lead to more explicit protein identification from complex peptide mixtures such as whole proteome digests from tissue samples.
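The validation idea in this abstract (fit a correlation curve between deduced and measured elution times from probable assignments, then accept assignments that fall within a tolerance of that curve) can be sketched as below. This is not ScoreRidge's code: the straight-line fit, the function names, the peptide labels, and all times are simplifying assumptions, and the deduced elution times are taken as given rather than computed from sequence.

```python
def fit_line(points):
    """Least-squares slope and intercept for (deduced, measured) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def validate(trusted, candidates, tolerance):
    """Names of candidates whose measured time lies near the fitted line.

    trusted: (deduced, measured) pairs from probable assignments.
    candidates: (name, deduced, measured) triples to be checked.
    """
    slope, intercept = fit_line(trusted)
    return [name for name, deduced, measured in candidates
            if abs(measured - (slope * deduced + intercept)) <= tolerance]


# Hypothetical times in minutes.
trusted = [(1, 3), (2, 5), (3, 7), (4, 9)]
candidates = [("PEPA", 5, 11.2), ("PEPB", 5, 15.0)]
print(validate(trusted, candidates, tolerance=0.5))
```

Fitting on the probable assignments only, rather than on every candidate, keeps doubtful assignments from pulling the curve toward themselves.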

17.
The prediction of the effects of nonsynonymous single nucleotide polymorphisms (nsSNPs) on function depends critically on exploiting all information available on the three-dimensional structures of proteins. We describe software and databases for the analysis of nsSNPs that allow a user to move from SNP to sequence to structure to function. In both structure prediction and the analysis of the effects of nsSNPs, we exploit information about protein evolution, in particular, that derived from investigations on the relation of sequence to structure gained from the study of amino acid substitutions in divergent evolution. The techniques developed in our laboratory have allowed fast and automated sequence-structure homology recognition to identify templates and to perform comparative modeling; as well as simple, robust, and generally applicable algorithms to assess the likely impact of amino acid substitutions on structure and interactions. We describe our strategy for approaching the relationship between SNPs and disease, and the results of benchmarking our approach on human proteins of known structure and recognized mutation.

18.
The primary and secondary structure of human plasma apolipoprotein A-I and apolipoprotein E-3 have been analyzed to further our understanding of the secondary and tertiary conformation of these proteins and the structure and function of plasma lipoprotein particles. The methods used to analyze the primary sequence of these proteins used computer programs: (a) to identify repeated patterns within these proteins on the basis of conservative substitutions and similarities within the physicochemical properties of each residue; (b) for local averaging, hydrophobic moment, and Fourier analysis of the physicochemical properties; and (c) for secondary structure prediction of each protein carried out using homology, statistical, and information theory based methods. Circular dichroism was used to study purified lipid-protein complexes of each protein and quantitate the secondary structure in a lipid environment. The data from these analyses were integrated into a single secondary structure prediction to derive a model of each protein. The sequence homology within apolipoproteins A-I, E-3, and A-IV is used to derive a consensus sequence for two 11 amino acid repeating sequences in this family of proteins.

19.
A simple method for displaying the hydropathic character of a protein   Total citations: 9; self-citations: 0; citations by others: 9
A computer program that progressively evaluates the hydrophilicity and hydrophobicity of a protein along its amino acid sequence has been devised. For this purpose, a hydropathy scale has been composed wherein the hydrophilic and hydrophobic properties of each of the 20 amino acid side-chains are taken into consideration. The scale is based on an amalgam of experimental observations derived from the literature. The program uses a moving-segment approach that continuously determines the average hydropathy within a segment of predetermined length as it advances through the sequence. The consecutive scores are plotted from the amino to the carboxy terminus. At the same time, a midpoint line is printed that corresponds to the grand average of the hydropathy of the amino acid compositions found in most of the sequenced proteins. In the case of soluble, globular proteins there is a remarkable correspondence between the interior portions of their sequence and the regions appearing on the hydrophobic side of the midpoint line, as well as the exterior portions and the regions on the hydrophilic side. The correlation was demonstrated by comparisons between the plotted values and known structures determined by crystallography. In the case of membrane-bound proteins, the portions of their sequences that are located within the lipid bilayer are also clearly delineated by large uninterrupted areas on the hydrophobic side of the midpoint line. As such, the membrane-spanning segments of these proteins can be identified by this procedure. Although the method is not unique and embodies principles that have long been appreciated, its simplicity and its graphic nature make it a very useful tool for the evaluation of protein structures.
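The moving-segment approach described in this abstract can be sketched as a sliding-window average over a per-residue hydropathy scale. The table below gives the Kyte-Doolittle hydropathy values as they are commonly tabulated; treat them, the function name, the window length, and the sample sequence as assumptions of this example rather than a reproduction of the original program.

```python
# Hydropathy values per one-letter amino acid code (assumed to match
# the published Kyte-Doolittle scale; verify before serious use).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Average hydropathy of each full window, from N- to C-terminus."""
    if not 1 <= window <= len(sequence):
        raise ValueError("window must be between 1 and len(sequence)")
    values = [HYDROPATHY[residue] for residue in sequence.upper()]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Made-up sequence: a hydrophobic stretch flanked by charged residues.
for score in hydropathy_profile("KKDEILVLAIVLGKKDE", window=7):
    print(f"{score:6.2f}")
```

Longer windows smooth the profile and suit the search for membrane-spanning segments, while shorter windows follow local surface and interior alternation more closely.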

20.
Coordinated amino acid changes in homologous protein families   Total citations: 4; self-citations: 0; citations by others: 4
In the tobamovirus coat protein family, amino acid residues at some spatially close positions are found to be substituted in a coordinated manner [Altschuh et al. (1987) J. Mol. Biol., 193, 693]. Therefore, these positions show an identical pattern of amino acid substitutions when amino acid sequences of these homologous proteins are aligned. Based on this principle, coordinated substitutions have been searched for in three additional protein families: serine proteases, cysteine proteases and the haemoglobins. Coordinated changes have been found in all three protein families mostly within structurally constrained regions. This method works with a varying degree of success depending on the function of the proteins, the range of sequence similarities and the number of sequences considered. By relaxing the criteria for residue selection, the method was adapted to cover a broader range of protein families and to study regions of the proteins having weaker structural constraints. The information derived by these methods provides a general guide for engineering of a large variety of proteins to analyse structure-function relationships.
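The principle stated in this abstract (coordinated positions show an identical pattern of substitutions across the aligned sequences) can be sketched by reducing each alignment column to the way it partitions the sequences and reporting pairs of variable columns with the same partition. This covers only the strict, identical-pattern case; the relaxed criteria mentioned in the abstract are not specified there and are not attempted. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def substitution_pattern(column):
    """Label each sequence by the order in which its residue first appears.

    Columns ('A','A','G','G') and ('K','K','R','R') both give (0, 0, 1, 1).
    """
    seen = {}
    return tuple(seen.setdefault(residue, len(seen)) for residue in column)


def coordinated_positions(alignment):
    """Pairs of variable columns (0-based) with identical patterns."""
    if len({len(sequence) for sequence in alignment}) != 1:
        raise ValueError("aligned sequences must have equal length")
    patterns = {
        index: substitution_pattern(column)
        for index, column in enumerate(zip(*alignment))
        if len(set(column)) > 1  # skip fully conserved columns
    }
    return [(i, j) for i, j in combinations(sorted(patterns), 2)
            if patterns[i] == patterns[j]]


# Toy alignment: columns 0 and 1 change together, column 2 is
# conserved, column 3 changes independently.
alignment = ["AKLD", "AKLD", "GRLD", "GRLE"]
print(coordinated_positions(alignment))
```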
