Similar articles
20 similar articles found.
1.
In this study, an attempt has been made to predict the major functions of gram-negative bacterial proteins from their amino acid sequences. The dataset used for training and testing consists of 670 non-redundant gram-negative bacterial proteins (255 of cellular process, 60 of information molecules, 285 of metabolism, and 70 of virulence factors). First we developed an SVM-based method using amino acid and dipeptide composition and achieved overall accuracies of 52.39% and 47.01%, respectively. We introduced a new concept for the classification of proteins based on tetrapeptides, in which we identified the unique tetrapeptides significantly found in a class of proteins. These tetrapeptides were used as the input feature for predicting the function of a protein and achieved an overall accuracy of 68.66%. We also developed a hybrid method in which the tetrapeptide information was used with amino acid composition and achieved an overall accuracy of 70.75%. Five-fold cross-validation was used to evaluate the performance of these methods. The web server VICMpred has been developed for predicting the function of gram-negative bacterial proteins (http://www.imtech.res.in/raghava/vicmpred/).
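The amino acid and dipeptide composition features mentioned in this abstract can be illustrated with a minimal sketch. It is not the VICMpred implementation: the function name `kmer_composition`, the fixed residue ordering, and the example sequence are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in a fixed (alphabetical) order; the ordering is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def kmer_composition(sequence, k):
    """Return the fraction of each possible k-mer among the overlapping k-mers of a sequence.

    k=1 gives a 20-value amino acid composition vector,
    k=2 gives a 400-value dipeptide composition vector.
    """
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[m] for m in kmers)  # ignores k-mers with non-standard residues
    return [counts[m] / total if total else 0.0 for m in kmers]

# Hypothetical example sequence, not taken from the dataset described above.
aa_features = kmer_composition("MKVLAAGIVAL", 1)
dp_features = kmer_composition("MKVLAAGIVAL", 2)
```

Vectors like these could be passed to any classifier; the abstract reports using an SVM, whose configuration is not described here.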

2.
The complete amino acid sequence of cassowary (Casuarius casuarius) goose-type lysozyme was analyzed by direct protein sequencing of peptides obtained by cleavage with trypsin, V8 protease, chymotrypsin, lysyl endopeptidase, and cyanogen bromide. The N-terminal residue of the enzyme was deduced to be a pyroglutamate group by analysis with an LC/MS/MS system equipped with the oMALDI ionization source, and then confirmed by a glutamate aminopeptidase enzyme. This blocked N-terminus is the first reported in this enzyme group. The positions of disulfide bonds in this enzyme were chemically identified as Cys4-Cys60 and Cys18-Cys29. Cassowary lysozyme was shown to consist of 185 amino acid residues and had a molecular mass of 20408 Da calculated from the amino acid sequence. The amino acid sequence of cassowary lysozyme compared to that of reported G-type lysozymes had identities of 90%, 83%, and 81%, for ostrich, goose, and black swan lysozymes, respectively. The amino acid substitutions at PyroGlu1, Glu19, Gly40, Asp82, Thr102, Thr156, and Asn167 were newly detected in this enzyme group. The substituted amino acids that might contribute to substrate binding were found at subsite B (Asn122Ser, Phe123Met). The amino acid sequences that formed three alpha-helices and three beta-sheets were completely conserved. The disulfide bond locations and catalytic amino acid were also strictly conserved. The conservation of the three alpha-helix structures and the location of disulfide bonds were considered to be important for the formation of the hydrophobic core structure of the catalytic site and for maintaining a similar three-dimensional structure in this enzyme group.

3.
It has been found that 1500 tetrapeptides out of 160000 possible combinations occurring in proteins exhibit preference for particular conformational states. Most conformationally stable tetrapeptides obtained in the analysis of a sample of 706 proteins are in the alpha-helical form. The features of the amino acid composition of conformationally stable oligopeptides have been studied.

4.
The reliability of the hydropathy method to predict the formation of membrane-spanning alpha-helices by integral membrane proteins and peptides whose structure is known from X-ray crystallography is analysed. It is shown that Kyte-Doolittle hydropathy plots do not predict accurately 22 transmembrane alpha-helices in the reaction centres (RC) of the photosynthetic bacteria Rhodopseudomonas viridis and Rhodobacter sphaeroides (R-26). The accuracy of prediction for these proteins was improved using an optimised Kyte-Doolittle hydrophobicity scale. However, this hydrophobicity scale did not improve the predictions for the alphabeta-peptides of the B800-850 (LH2) complexes of the photosynthetic bacteria Rhodopseudomonas acidophila and Rhodospirillum molischianum, which were excluded from the optimisation procedure. The best and worst predictions of membrane-spanning alpha-helices for the RC proteins and LH2 peptides, respectively, were obtained with a propensity scale (PRC) calculated from the amino acid sequences and X-ray data for the RC proteins. A propensity scale (PLH) obtained using the amino acid sequences and X-ray data for the alphabeta-peptides of the LH2 complexes did not give an acceptable prediction of the transmembrane segments in the LH2 peptides; moreover, it markedly contradicted the PRC scale. It is concluded that amino acids have no significant preference for localisation in transmembrane segments. Therefore, the predictive ability of the hydropathy methodology appears to be limited: the number of transmembrane segments can be correctly calculated for the best case only, and the lengths and positions of membrane-spanning alpha-helices in a protein amino acid sequence cannot be predicted exactly.
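For readers unfamiliar with the technique assessed in this abstract, a Kyte-Doolittle hydropathy plot is a sliding-window average of per-residue hydropathy values. The sketch below uses the standard Kyte-Doolittle scale; the function name `hydropathy_profile` and the default window of 19 residues are assumptions, and the optimised scale and the PRC and PLH propensity scales from the study are not reproduced.

```python
# Standard Kyte-Doolittle hydropathy values for the 20 amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=19):
    """Return the mean hydropathy of every full window along the sequence.

    Entry i covers residues i .. i + window - 1 (0-based).
    The window length of 19 is an assumed default, not a value from the abstract.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

Stretches with a high windowed average are the candidate membrane-spanning segments; as the abstract notes, such predictions can miss the true lengths and positions of the helices.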

5.
Wang J, Feng JA. Protein Engineering, 2003, 16(11): 799-807
This paper reports an extensive sequence analysis of the alpha-helices of proteins. alpha-Helices were extracted from the Protein Data Bank (PDB) and were divided into groups according to their sizes. It was found that some amino acids had differential propensity values for adopting helical conformation in short, medium and long alpha-helices. Pro and Trp had a significantly higher propensity for helical conformation in short helices than in medium and long helices. Trp was the strongest helix conformer in short helices. Sequence patterns favoring helical conformation were derived from a neighbor-dependent sequence analysis of proteins, which calculated the effect of neighboring amino acid type on the propensity of residues for adopting a particular secondary structure in proteins. This method produced an enhanced statistical significance scale that allowed us to explore the positional preference of amino acids for alpha-helical conformations. It was shown that the amino acid pair preference for alpha-helix had a unique pattern and this pattern was not always predictable by assuming proportional contributions from the individual propensity values of the amino acids. Our analysis also yielded a series of amino acid dyads that showed preference for alpha-helix conformation. The data presented in this study, along with our previous study on loop sequences of proteins, should prove useful for developing potential 'codes' for recognizing sequence patterns that are favorable for specific secondary structural elements in proteins.

6.
The three-dimensional structure of goose-type lysozyme (GEWL), determined by x-ray crystallography and refined at high resolution, has similarities to the structures of hen (chicken) egg-white lysozyme (HEWL) and bacteriophage T4 lysozyme (T4L). The nature of the structural correspondence suggests that all three classes of lysozyme diverged from a common evolutionary precursor, even though their amino acid sequences appear to be unrelated (Grütter et al. 1983). In this paper we make detailed comparisons of goose-type, chicken-type, and phage-type lysozymes. The lysozymes have undergone conformational changes at both the global and the local level. As in the globins, there are corresponding alpha-helices that have rigid-body displacements relative to each other, but in some cases corresponding helices have increased or decreased in length, and in other cases there are helices in one structure that have no counterpart in another. Independent of the overall structural correspondence among the three lysozyme backbones is another, distinct correspondence between a set of three consecutive alpha-helices in GEWL and three consecutive alpha-helices in T4L. This structural correspondence could be due, in part, to a common energetically favorable contact between the first and the third helices. There are similarities in the active sites of the three lysozymes, but also one striking difference. Glu 73 (GEWL) spatially corresponds to Glu 35 (HEWL) and to Glu 11 (T4L). On the other hand, there are two aspartates in the GEWL active site, Asp 86 and Asp 97, neither of which corresponds exactly to Asp 52 (HEWL) or Asp 20 (T4L). (The discrepancy in the location of the carboxyl groups is about 10 Å for Asp 86 and 4 Å for Asp 97.) This lack of structural correspondence may reflect some differences in the mechanisms of action of the three lysozymes.
When the amino acid sequences of the three lysozyme types are aligned according to their structural correspondence, there is still no apparent relationship between the sequences except for possible weak matching in the vicinity of the active sites.

7.
The prediction of the secondary structure of proteins from their amino acid sequences remains a key component of many approaches to the protein folding problem. The most abundant form of regular secondary structure in proteins is the alpha-helix, in which specific residue preferences exist at the N-terminal locations. Propensities derived from these observed amino acid frequencies in the Protein Data Bank (PDB) database correlate well with experimental free energies measured for residues at different N-terminal positions in alanine-based peptides. We report a novel method to exploit these data to improve protein secondary structure prediction through identification of the correct N-terminal sequences in alpha-helices, based on existing popular methods for secondary structure prediction. With this algorithm, the number of correctly predicted alpha-helix start positions was improved from 30% to 38%, while the overall prediction accuracy (Q3) remained the same, using cross-validated testing. Although the algorithm was developed and tested on multiple sequence alignment-based secondary structure predictions, it was also able to improve the predictions of start locations by methods that use single sequences to make their predictions. Furthermore, the residue frequencies at N-terminal positions of the improved predictions better reflect those seen at the N-terminal positions of alpha-helices in proteins. This has implications for areas such as comparative modeling, where a more accurate prediction of the N-terminal regions of alpha-helices should benefit attempts to model adjacent loop regions. The algorithm is available as a Web tool, located at http://rocky.bms.umist.ac.uk/elephant.

8.
Protein structure and neutral theory of evolution (cited 2 times: 0 self-citations, 2 citations by others)
The neutral theory of evolution is extended to the origin of protein molecules. Arguments are presented which suggest that the amino acid sequences of many globular proteins mainly represent "memorized" random sequences, while biological evolution reduces to the "editing" of these random sequences. Physical requirements for a functional globular protein are formulated, and it is shown that many of these requirements do not involve strategic selection of amino acid sequences during biological evolution but are also inherent in typical random sequences. In particular, it is shown that random sequences of polar and nonpolar amino acid residues can form alpha-helices and beta-strands with lengths and arrangement along the chain similar to those in real globular proteins. These alpha- and beta-regions in random sequences can form three-dimensional folding patterns also similar to those in proteins. Arguments are presented suggesting that even the tight packing of side groups inside the protein core does not require very strong biological selection of amino acid sequences either. Thus many structural features of real proteins can also exist in random sequences, and biological selection is needed mainly for the creation of the active site of a protein and for its stability under physiological conditions.

9.
Separate proteins for proton-linked transport of D-xylose, L-arabinose, D-galactose, L-rhamnose and L-fucose into Escherichia coli are being studied. By cloning and sequencing the appropriate genes, the amino acid sequences of proteins for D-xylose/H+ symport (XylE), L-arabinose/H+ symport (AraE), and part of the protein for D-galactose/H+ symport (GalP) have been determined. These are homologous, with at least 28% identical amino acid residues conserved in the aligned sequences, although their primary sequences are not similar to those of other E. coli transport proteins for lactose, melibiose, or D-glucose. However, they are equally homologous to the passive D-glucose transport proteins from yeast, rat brain, rat adipocytes, human erythrocytes, human liver, and a human hepatoma cell line. The substrate specificity of GalP from E. coli is similar to that of the mammalian glucose transporters. Furthermore, the activities of GalP, AraE and the mammalian glucose transporters are all inhibited by cytochalasin B and N-ethylmaleimide. Conserved residues in the aligned sequences of the bacterial and mammalian transporters are identified, and the possible roles of some in sugar binding, cation binding, cytochalasin binding, and reaction with N-ethylmaleimide are discussed. Each protein is independently predicted to form 12 hydrophobic, membrane-spanning alpha-helices with a central hydrophilic segment, also comprised of alpha-helix. This unifying structural model of the sugar transporters shares features with other ion-linked transport proteins for citrate or tetracycline.

10.
The effect of changing the 1st and 4th amino acid residues on the beta-turn preference of tetrapeptide sequences was studied by use of CD spectra of the chromophoric derivatives, which have Dnp- and pNA-groups as the amino and carboxyl substituents, respectively. The effect was examined with the tetrapeptides having such sequences at the 2nd and 3rd positions as -L-Pro-L-Asn-, -L-Pro-Gly-, -L-Pro-D-Ala-, -L-Ala-D-Leu-, -L-Ala-L-Pro-, and -D-Ala-L-Pro-. The beta-turn preferences estimated from the CD intensities of the bands due to exciton interaction were found to depend largely on the configurations of the 1st and 4th amino acid residues. When the 1st and 2nd (or 3rd and 4th) residues had the same configuration, decreased intensity of the CD band was observed even if the internal sequence had high beta-turn preference. Terminal Gly residues were favorable for the beta-turn conformation in many of the tetrapeptide sequences examined.

11.
12.
H Aquila, T A Link, M Klingenberg. The EMBO Journal, 1985, 4(9): 2369-2376
We report here, for the first time, the primary structure of uncoupling protein as established by amino acid sequencing. Like the ADP/ATP carrier, this protein has a tripartite structure comprising three similar sequences of approximately 100 residues each. These six 'repeats' exhibit striking conservation of several residues, in particular glycine and proline, at possible structurally strategic positions. Although the two proteins differ strongly in their amino acid composition, their sequences are distantly homologous. Three membrane-spanning alpha-helices can be deduced from hydropathy plots. A modified plot accounting for amphiphilic helices indicates 5-6 such alpha-segments. In addition an amphiphilic beta-strand of membrane-spanning length can be discerned. The tripartite sequence structure is also distinctly reflected in the hydropathy distribution. Based on the membrane disposition of the segments of the ADP/ATP carrier, a model for the transmembrane folding path of the polypeptide chain of the uncoupling protein is proposed.

13.
Methods for predicting peptide chain conformation have been applied to amino acid sequences adjacent to the carbohydrate attachment sites of glycoproteins containing the N-glycosylamine type of protein-carbohydrate linkage. Of 31 glycosylated residues examined, 30 occur in sequences favouring turn or loop structures. Twenty-two of the glycosylated asparagine residues occur in tetrapeptides predicted to have the β-turn conformation. Carbohydrate attachment is therefore associated with peptide sequences which favour the formation of β-turn or other turn or loop structures.

14.
A Iu Kuz'minov. Biofizika, 1987, 32(2): 206-209
The present paper deals with the relationship between the order of amino acids in comparatively short oligopeptides (tetrapeptides) and their conformational potentialities. It is shown that the spatial and conformational possibilities of tetrapeptides composed of the same amino acid residues are highly sensitive to their mutual arrangement, i.e. to the amino acid sequence. A detailed conformational analysis demonstrated that the difference in conformational possibilities is mainly determined by different conditions for the realization of inter-residue interactions. It is shown that the energetic differences between the fragments are due to different interaction contributions for each of the fragments considered.

15.
S Ohno. Animal Genetics, 1994, 25(Z1): 5-11
Actual protein amino acid sequences are very different from random assemblages of 20 varieties of amino acids. The separate survey of 20 unrelated proteins in two steps that included eight of the 18 discussed in this paper, revealed that at the level of 5000 total residues, one out of every 32 tetrapeptides appeared in two or more identical copies, whereas at the level of 10 000 total residues, the frequency was elevated to one out of every 29. It would thus appear that only 60 000 or so, out of the possible 160 000 (20^4) varieties of tetrapeptides, are regularly used by all proteins. These shall be defined as ubiquitous tetrapeptides. Those tetrapeptides occasionally found to be stray which did not belong to the above group of 60 000 must have been generated by new mutations. Thus, they are expected to return to the group by subsequent mutations. The above ubiquity is due to the cardinal principle of protein construction which is like attracting like. On the average, 28% of each residue is devoted to the formation of homodipeptides such as Leu-Leu, Asn-Asn and Trp-Trp. Consequently, homo-oligopeptides, pentapeptidic and longer, are readily found in two or more proteins unrelated to each other. The next in line among the ubiquitous oligopeptides are those made of similar residues. They usually contain palindromic cores such as Leu-Val-Leu, Ala-Gly-Ala and Lys-Arg-Lys. For example, the hexapeptide Ala-Gly-Ala-Asp-Ala-Ala is shared between human phosphofructokinase and bacterial cytochrome C. Provided that they are longer than 60 residues, all proteins contain repeating oligopeptides, tetrapeptidic to heptapeptidic in length. The above principle of like attracting like is the very reason that hydropathic profiles of most proteins readily yield alternating stretches of hydrophilic and hydrophobic segments.
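The tetrapeptide counting described in this abstract can be sketched as follows. This is an illustration only, not the author's survey procedure: the function name `repeated_tetrapeptides` and the two short example sequences are hypothetical.

```python
from collections import Counter

# Number of distinct tetrapeptides that 20 amino acids can form: 20^4 = 160 000.
TOTAL_POSSIBLE_TETRAPEPTIDES = 20 ** 4

def repeated_tetrapeptides(sequences):
    """Return the tetrapeptides seen two or more times across a set of sequences.

    Overlapping windows of four residues are counted in every sequence,
    and the counts are pooled over all sequences.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(seq[i:i + 4] for i in range(len(seq) - 3))
    return {tetra: n for tetra, n in counts.items() if n >= 2}

# Hypothetical toy sequences that share the tetrapeptide AGAD (one-letter code).
shared = repeated_tetrapeptides(["AGADAAK", "MAGADL"])
```

With these toy inputs `shared` contains only `AGAD`, seen once in each sequence.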

16.
Glutamine synthetase is encoded by the glnA gene of Escherichia coli and catalyzes the formation of glutamine from ATP, glutamate, and ammonia. A 1922-base pair fragment from a cDNA containing the glnA structural gene for E. coli glutamine synthetase has been sequenced. An open reading frame of 1404 base pairs encodes a protein of 468 amino acid residues with a calculated molecular weight of 51,814. With few exceptions, the amino acid sequence deduced from the DNA sequence agreed very well with the amino acid sequences of several peptides reported previously. The secondary structure predicted for the E. coli enzyme has approximately 36% of the residues in alpha-helices, which is in agreement with calculations of approximately 39% based on optical rotatory dispersion data. Comparison of the amino acid sequences of glutamine synthetase from E. coli (468 amino acids) and Anabaena (473 amino acids) (Turner, N. E., Robinson, S. T., and Haselkorn, R. (1983) Nature 306, 337-342) indicates that 260 amino acids are identical and 80 are of the same type (polar or nonpolar) when aligned for maximum homology. Several homologous regions of these two enzymes exist, including the sites of adenylylation and oxidative modification, but the regulation of each enzyme is different.

17.
Amphiphilic alpha-helices play a major role in membrane dependent processes and are manifested in the primary structure of a protein by the periodic appearance of hydrophobic residues. Based on these periodic sequences, the hydrophobic moment, ⟨μH⟩, was introduced, which essentially treats the hydrophobicity of amino acid residues as a two-dimensional vector sum and provides a measure of amphiphilicity within regular repeat structures. To identify putative amphiphilic alpha-helix forming sequences, hydrophobic moment analysis assumes an amino acid residue periodicity of 100° and scans protein primary structures to find the 11-residue window with maximal ⟨μH⟩. Taken with the window's mean hydrophobicity, ⟨H⟩, hydrophobic moment plot analysis uses the coordinate pair [⟨μH⟩, ⟨H⟩] to classify alpha-helices as either surface active, globular or transmembrane. More recently, this latter analysis has been extended to recognize candidate oblique orientated alpha-helices. Here, the hydrophobic moment is reviewed and data to query the logic of using a fixed window length and a fixed residue angular periodicity in hydrophobic moment analysis are provided. In addition, problems associated with the use of such analysis to predict alpha-helix structure/function relationships are considered.
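The window scan described in this abstract can be sketched in a few lines. The sketch assumes the usual definition of the hydrophobic moment as the length of the two-dimensional vector sum of residue hydrophobicities, divided by the window length; the function names and the caller-supplied `scale` dictionary are assumptions, and no particular hydrophobicity scale is implied by the abstract.

```python
import math

def hydrophobic_moment(hydrophobicities, angle_deg=100.0):
    """Mean hydrophobic moment of a window, assuming a fixed angular periodicity.

    Residue n is placed at angle n * angle_deg around the helix axis and its
    hydrophobicity is treated as the length of a two-dimensional vector.
    """
    sin_sum = sum(h * math.sin(math.radians(n * angle_deg))
                  for n, h in enumerate(hydrophobicities))
    cos_sum = sum(h * math.cos(math.radians(n * angle_deg))
                  for n, h in enumerate(hydrophobicities))
    return math.hypot(sin_sum, cos_sum) / len(hydrophobicities)

def best_window(sequence, scale, window=11, angle_deg=100.0):
    """Return (start, moment, mean_hydrophobicity) for the window with maximal moment.

    `scale` maps one-letter residue codes to hydrophobicity values and must be
    supplied by the caller. Returns None if the sequence is shorter than the window.
    """
    best = None
    for start in range(len(sequence) - window + 1):
        values = [scale[aa] for aa in sequence[start:start + window]]
        moment = hydrophobic_moment(values, angle_deg)
        if best is None or moment > best[1]:
            best = (start, moment, sum(values) / window)
    return best
```

The returned moment and mean hydrophobicity correspond to the coordinate pair used in hydrophobic moment plots. The fixed `window` and `angle_deg` defaults are exactly the assumptions the review questions.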

18.
Proline-induced distortions of transmembrane helices (cited 14 times: 0 self-citations, 14 citations by others)
Proline residues in the transmembrane (TM) alpha-helices of integral membrane proteins have long been suspected to play a key role in helix packing and signal transduction by inducing regions of helix distortion and/or dynamic flexibility (hinges). In this study we try to characterise the effect of proline on the geometric properties of TM alpha-helices. We have examined 199 transmembrane alpha-helices from polytopic membrane proteins of known structure. After examining the location of proline residues within the amino acid sequences of TM helices, we estimated the helix axes on either side of a hinge and hence identified a hinge residue. This enabled us to calculate helix kink and swivel angles. The results of this analysis show that proline residues occur with a significant concentration in the centre of sequences of TM alpha-helices. In this location, they may induce formation of molecular hinges, located on average about four residues N-terminal to the proline residue. A superposition of proline-containing TM helix structures shows that the distortion induced is anisotropic and favours certain relative orientations (defined by helix kink and swivel angles) of the two helix segments.
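Once the axes of the two helix segments on either side of a hinge are known, a kink angle can be taken as the angle between the two axis direction vectors. The sketch below shows only that final step, under the assumption that each axis is already available as a 3D direction vector; the function name `kink_angle` is hypothetical, and the axis-fitting and swivel-angle calculations of the study are not reproduced.

```python
import math

def kink_angle(axis_a, axis_b):
    """Angle in degrees between two helix-axis direction vectors (3-tuples)."""
    dot = sum(a * b for a, b in zip(axis_a, axis_b))
    norm_a = math.sqrt(sum(a * a for a in axis_a))
    norm_b = math.sqrt(sum(b * b for b in axis_b))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))
```

Parallel axes give an angle near 0 degrees (an undistorted helix), and larger values indicate a stronger kink.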

19.
For some 20 proteins the mRNA codon base sequence inferred from the amino acid sequence shows remnants of a regular pattern of the triplet mid-base. The repeat is a 4-triplet unit, but the identity of one specific position in the unit is normally varied. It is assumed the evolutionary forerunners of these proteins were repeat tetrapeptides that also showed this variability. Many of these proteins are rich in aspartate and glutamate residues, showing a high incidence of guanine as triplet first base and a low incidence of guanine as the mid-base. This is seen as resulting from an altered, more restricted, codon availability. It is postulated that for these proteins the triplet first base was once an obligatory guanine, and genetic information was restricted to the triplet mid-base. When a 3-base genetic code became effective, intense mutational activity would introduce a second phase in the design of protein sequences.

20.
Low-energy conformations of a set of tetrapeptides derived from the small protein bovine pancreatic trypsin inhibitor (BPTI) were generated by a build-up procedure from the low-energy conformations of single amino acid residues. At each stage, various-size fragments were built up from all combinations of smaller ones, the total energies were then minimized, and the low-energy conformations were retained for the next stage. The energies of the tetrapeptides were re-ordered by including the effects of hydration. No information other than the amino acid sequence was used to obtain the low-energy conformations of the hydrated tetrapeptides. The latter were then supplemented with a limited set of simulated NMR distance information, derived from the X-ray structure of BPTI, to provide a basis for building the rest of the whole protein molecule by the same procedure. A total of 189 upper bounds, plus 12 pairs of upper and lower bounds pertaining to the location of the three disulfide bonds in this molecule, were used. Four sets of conformations of the entire molecule were generated by utilizing different combinations of smaller fragments. It was possible to obtain low-energy conformations with small rms deviations, 1.1 to 1.4 Å for the alpha-carbons, from the structure derived by X-ray diffraction. The average deviations of the backbone dihedral angles were also low, viz. 23 degrees to 26 degrees.
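The rms deviation quoted for the alpha-carbons can be illustrated with a minimal sketch. It assumes the two coordinate sets list the same atoms in the same order and have already been superimposed; the optimal superposition step, and whatever procedure the authors actually used, is not shown. The function name `rmsd` is hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumes the structures are already superimposed and atoms are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))
```

If the coordinates are given in ångströms, the result is in ångströms as well.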
