Similar literature
 Found 20 similar documents; search took 15 ms
1.
2.
Knowledge of amino acid composition, alone, is verified here to be sufficient for recognizing the structural class, α, β, α+β, or α/β of a given protein with an accuracy of 81%. This is supported by results from exhaustive enumerations of all conformations for all sequences of simple, compact lattice models consisting of two types (hydrophobic and polar) of residues. Different compositions exhibit strong affinities for certain folds. Within the limits of validity of the lattice models, two factors appear to determine the choice of particular folds: 1) the coordination numbers of individual sites and 2) the size and geometry of non-bonded clusters. These two properties, collectively termed the distribution of non-bonded contacts, are quantitatively assessed by an eigenvalue analysis of the so-called Kirchhoff or adjacency matrices obtained by considering the non-bonded interactions on a lattice. The analysis permits the identification of conformations that possess the same distribution of non-bonded contacts. Furthermore, some distributions of non-bonded contacts are favored entropically, due to their high degeneracies. Thus, a competition between enthalpic and entropic effects is effective in determining the choice of a distribution for a given composition. Based on these findings, an analysis of non-bonded contacts in protein structures was made. The analysis shows that proteins belonging to the four distinct folding classes exhibit significant differences in their distributions of non-bonded contacts, which more directly explains the success in predicting structural class from amino acid composition. Proteins 29:172–185, 1997. Published 1997 Wiley-Liss, Inc.
  • 1 This article is a US Government work and, as such, is in the public domain in the United States of America.
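The eigenvalue analysis described in entry 2 can be illustrated with a minimal sketch. It assumes a two-dimensional square lattice and defines a non-bonded contact as two residues on adjacent sites that are not chain neighbours; the function names and the two toy 6-residue conformations are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def kirchhoff_nonbonded(coords):
    """Kirchhoff (graph Laplacian) matrix of non-bonded contacts.

    coords: list of (x, y) integer lattice sites in chain order.
    Residues i and j are in non-bonded contact when |i - j| > 1 and
    their sites are nearest neighbours on the square lattice.
    """
    n = len(coords)
    adjacency = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 2, n):  # skip chain-bonded neighbours
            dx = abs(coords[i][0] - coords[j][0])
            dy = abs(coords[i][1] - coords[j][1])
            if dx + dy == 1:
                adjacency[i, j] = adjacency[j, i] = 1
    # Diagonal holds each site's number of non-bonded contacts.
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency

def contact_spectrum(coords):
    """Eigenvalues of the Kirchhoff matrix in ascending order."""
    return np.linalg.eigvalsh(kirchhoff_nonbonded(coords))

# Two different compact conformations of a 6-residue chain (hypothetical).
conf_a = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1)]  # U-shaped
conf_b = [(0, 0), (0, 1), (1, 1), (1, 0), (2, 0), (2, 1)]  # S-shaped
straight = [(i, 0) for i in range(6)]                      # no contacts

same = np.allclose(contact_spectrum(conf_a), contact_spectrum(conf_b))
print("conf_a and conf_b share a contact spectrum:", same)
```

Comparing spectra in this way groups conformations by their distribution of non-bonded contacts without matching the contact maps site by site; identical spectra are a necessary, not a sufficient, signature of identical contact graphs.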

    3.
    4.
    Using an information theoretic formalism, we optimize classes of amino acid substitution to be maximally indicative of local protein structure. Our statistically-derived classes are loosely identifiable with the heuristic constructions found in previously published work. However, while these other methods provide a more rigid idealization of physicochemically constrained residue substitution, our classes provide substantially more structural information with many fewer parameters. Moreover, these substitution classes are consistent with the paradigmatic view of the sequence-to-structure relationship in globular proteins which holds that the three-dimensional architecture is predominantly determined by the arrangement of hydrophobic and polar side chains with weak constraints on the actual amino acid identities. More specific constraints are imposed on the placement of prolines, glycines, and the charged residues. These substitution classes have been used in highly accurate predictions of residue solvent accessibility. They could also be used in the identification of homologous proteins, the construction and refinement of multiple sequence alignments, and as a means of condensing and codifying the information in multiple sequence alignments for secondary structure prediction and tertiary fold recognition. © 1996 Wiley-Liss, Inc.

    5.
    Highly expressed genes in any species differ in the usage frequency of synonymous codons. The relative recurrence of the favored codon pair (amino acid pairs) varies between genes and genomes due to varying gene expression and different base composition. Here we propose a new measure for predicting the gene expression level, i.e., codon plus amino bias index (CABI). Our approach is based on the relative bias of the favored codon pair inclination among the genes, illustrated by analyzing the CABI score of the Medicago truncatula genes. CABI showed strong correlation with all other widely used measures (CAI, RCBS, SCUO) for gene expression analysis. Surprisingly, CABI outperforms all other measures by showing better correlation with the wet-lab data. This emphasizes the importance of the neighboring codons of the favored codon in a synonymous group while estimating the expression level of a gene.

    6.
    We derive an analytic expression for site-specific stationary distributions of amino acids from the structurally constrained neutral (SCN) model of protein evolution with conservation of folding stability. The stationary distributions that we obtain have a Boltzmann-like shape, and their effective temperature parameter, measuring the limit of divergent evolutionary changes at a given site, can be predicted from a site-specific topological property, the principal eigenvector of the contact matrix of the native conformation of the protein. These analytic results, obtained without free parameters, are compared with simulations of the SCN model and with the site-specific amino acid distributions obtained from the Protein Data Bank. These results also provide new insights into how the topology of a protein fold influences its designability, i.e., the number of sequences compatible with that fold. The dependence of the effective temperature on the principal eigenvector decreases for longer proteins, as a possible consequence of the fact that selection for thermodynamic stability becomes weaker in this case.

    7.
    8.
    A protein is usually classified into one of the following four structural classes: all alpha, all beta, (alpha + beta) and alpha/beta. In this paper, based on the maximum correlation-coefficient principle, a new formulation is proposed for predicting the structural class of a protein according to its amino acid composition. Calculations have been made for a development set of proteins, from which the amino acid compositions for the standard structural classes were derived, and for an independent set of proteins outside the development set. The former tests the self-consistency of a method and the latter tests its extrapolating effectiveness. In both cases, the results showed that the new method gave a considerably higher rate of correct prediction than any of the previous methods, implying that a significant improvement has been achieved by implementing the maximum correlation-coefficient principle in the new method.
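The idea in entry 8 of assigning a protein to the class whose standard composition correlates best with its own can be sketched as follows. This is a minimal illustration only: it uses the ordinary Pearson coefficient as a stand-in (the abstract does not give the paper's exact coefficient), and the four reference sequences are made-up toy strings, not a real development set.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """20-component vector of amino acid fractions for a sequence."""
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def predict_class(seq, references):
    """Return the class whose reference composition correlates best."""
    query = composition(seq)
    return max(references, key=lambda name: pearson(query, references[name]))

# Hypothetical toy sequences standing in for a development set.
TOY_CLASS_SEQUENCES = {
    "all_alpha": "AEELLKKLAEALKEMLQA",
    "all_beta": "TVYVTVSGKTYRVEVTWS",
    "alpha_plus_beta": "AEELLKKATVYVTVSGKT",
    "alpha_slash_beta": "VIVAGAELAKKVGIDVAE",
}
REFERENCES = {name: composition(s) for name, s in TOY_CLASS_SEQUENCES.items()}

print(predict_class("AEELLKKLAEALKEMLQR", REFERENCES))
```

Predicting the class of each reference sequence against the references themselves corresponds to the self-consistency test mentioned in the abstract; a real evaluation would also need an independent set.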

    9.
    Zhang TL, Ding YS. Amino Acids, 2007, 33(4): 623-629
    Compared with the conventional amino acid composition (AA), the pseudo amino acid composition (PseAA) as originally introduced by Chou can incorporate much more information of a protein sequence; this remarkably enhances the power to use a discrete model for predicting various attributes of a protein. In this study, based on the concept of Chou's PseAA, a 46-D (dimensional) PseAA was formulated to represent the sample of a protein and a new approach based on binary-tree support vector machines (BTSVMs) was proposed to predict the protein structural class. The BTSVM algorithm can resolve the problem of unclassifiable data points in multi-class SVMs. The results of both the 10-fold cross-validation and jackknife tests demonstrate that the predictive performance using the new PseAA (46-D) is better than that of AA (20-D), which is widely used in many algorithms for protein structural class prediction. The results obtained by the new approach are quite encouraging, indicating that it can at least play a complementary role to many of the existing methods and is a useful tool for predicting many other protein attributes as well.
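How a pseudo amino acid composition extends the 20-D composition with sequence-order terms can be sketched as below. This is a simplified illustration of the general PseAA scheme using a single property (the Kyte-Doolittle hydropathy scale) and assumed values for the number of tiers `lam` and the weight `weight`; it does not reproduce the 46-D construction of the paper, whose details are not given in the abstract.

```python
import math

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(KD)

def standardize(scale):
    """Rescale a property to zero mean and unit standard deviation."""
    values = list(scale.values())
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {aa: (v - mean) / sd for aa, v in scale.items()}

def pseaa(seq, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional pseudo amino acid composition."""
    if lam >= len(seq):
        raise ValueError("lam must be smaller than the sequence length")
    h = standardize(KD)
    freqs = [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]
    thetas = []
    for k in range(1, lam + 1):
        # k-th tier correlation factor: mean squared property difference
        # between residues separated by k positions along the chain.
        pairs = len(seq) - k
        total = sum((h[seq[i]] - h[seq[i + k]]) ** 2 for i in range(pairs))
        thetas.append(total / pairs)
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]

vector = pseaa("MKTAYIAKQRQISFVKSHFSRQ", lam=3)
print(len(vector))
```

The first 20 components carry the conventional composition and the last `lam` components carry sequence-order information; the normalisation makes all components sum to one.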

    10.
    Yamao F, Andachi Y, Muto A, Ikemura T, Osawa S. Nucleic Acids Research, 1991, 19(22): 6119-6122
    Transfer RNAs of Mycoplasma capricolum were separated by two-dimensional polyacrylamide gel electrophoresis, and the relative abundance of each of the 28 known tRNA species was measured. There existed a correlation between the relative amount of isoacceptor tRNAs and the frequency in choosing synonymous codons that could be translated by the isoacceptors. Furthermore, it was observed that the total amount of tRNAs for a particular amino acid was paralleled by the composition of the amino acid in ribosomal proteins. A similar relationship was obtained from reexamination of the previous data on Escherichia coli tRNAs, suggesting that the amount of tRNAs for an amino acid is affected by the usage of the amino acid in proteins.

    11.
    It is a critical challenge to develop automated methods for quickly and accurately determining the structures of proteins because of the increasingly widening gap between the number of sequence-known proteins and that of structure-known proteins in the post-genomic age. The knowledge of protein structural class can provide useful information towards the determination of protein structure. Thus, it is highly desirable to develop computational methods for identifying the structural classes of newly found proteins based on their primary sequence. In this study, according to the concept of Chou's pseudo amino acid composition (PseAA), eight PseAA vectors are used to represent protein samples. Each of the PseAA vectors is a 40-D (dimensional) vector, which is constructed from the conventional amino acid composition (AA) and a series of sequence-order correlation factors as originally introduced by Chou. The difference among the eight PseAA representations is that different physicochemical properties are used to incorporate the sequence-order effects for the protein samples. Based on such a framework, a dual-layer fuzzy support vector machine (FSVM) network is proposed to predict protein structural classes. In the first layer of the FSVM network, eight FSVM classifiers trained on different PseAA vectors are established. The second-layer FSVM classifier is applied to reclassify the outputs of the first layer. The results thus obtained are quite promising, indicating that the new method may become a useful tool for predicting not only the structural classification of proteins but also their other attributes.

    12.
    The fidelity of codon reading was examined in amino acid starved Escherichia coli. In one case the level of misincorporation of methionine was measured at an isoleucine residue encoded by either the commonly used AUU codon or the rarely used AUA codon. In this situation we found the frequency of methionine misincorporation to be very low and to be unaffected by the identity of the isoleucine codon. In other experiments histidine misincorporation for glutamine was measured in glutamine starved cells with normal levels of histidine-specific tRNA and cells overproducing this tRNA. Cells overproducing the tRNA had higher levels of misincorporation.

    13.
    Monoclonal antibodies (mAb) specific for mercuric ions were isolated from BALB/c mice injected with a mercury-containing, hapten-carrier complex. The antibodies reacted by enzyme-linked immunosorbent assay with bovine serum albumin-glutathione-mercuric chloride (BSA-GSH-HgCl) but not with BSA-GSH without mercury. Nucleotide sequences from polymerase chain reaction products encoding six of the antibody heavy-chain variable regions and seven light-chain variable regions revealed that all the antibodies contained an unpaired cysteine residue in one hypervariable region, which is unusual for murine antibodies. Mutagenesis of the cysteine to either tyrosine or serine in one of the Hg-binding antibodies, mAb 4A10, eliminated mercury binding. However, of two influenza-specific antibodies that contain cysteine residues at the same position as mAb 4A10, one reacted with mercury, although not so strongly as 4A10, whereas the other did not react at all. These results suggested that, in addition to an unpaired cysteine, there are other structural features, not yet identified, that are important for creating an appropriate environment for mercury binding. The antibodies described here could be useful for investigating mechanisms of metal-protein interactions and for characterizing antibody responses to structurally simple haptens.

    14.
    When the amino acid usage of all completely sequenced prokaryotes is studied by multivariate analysis (MVA), it is known that the genomic molar content of guanine plus cytosine (GC) and optimal growth temperature (Topt) have a dominant effect. Furthermore, these two factors are associated with the first two axes of different MVA, and are thus nearly independent of each other. However, it was recently shown that for several families of prokaryotes there are significant and positive correlations between GC and Topt. This trend is particularly clear within Bacillaceae, where there are species displaying a broad range of variations for these two factors. In this paper we report that (a) Topt and genomic GC are the main factors shaping amino acid usage but are not independent of each other, (b) the usage of cysteine is the second source of variability, and finally (c) the global hydrophobicity of the encoded proteins of each species is the third main factor.

    15.
    Sau K, Gupta SK, Sau S, Mandal SC, Ghosh TC. Bio Systems, 2006, 85(2): 107-113
    Synonymous codon and amino acid usage biases have been investigated in 903 Mimivirus protein-coding genes in order to understand the architecture and evolution of the Mimivirus genome. As expected for an AT-rich genome, third codon positions of the synonymous codons of Mimivirus carry mostly A or T bases. It was found that codon usage bias in Mimivirus genes is dictated both by mutational pressure and translational selection. The evidence shows that four factors, namely mean molecular weight (MMW), hydropathy, aromaticity and cysteine content, are mostly responsible for the variation of amino acid usage in Mimivirus proteins. Based on our observations, we suggest that genes involved in translation, DNA repair, protein folding, etc., were laterally transferred to Mimivirus long ago from living organisms and that, with time, these genes acquired the codon usage pattern of other Mimivirus genes under selection pressure.
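The observation in entry 15 that third codon positions carry mostly A or T can be checked with a very small calculation. The sketch below is an assumption-laden simplification: it counts every codon of an in-frame coding sequence, including start and stop codons and amino acids with a single codon, whereas the paper restricts the statement to synonymous codons. The function name and the toy sequence are hypothetical.

```python
def third_position_at_fraction(cds):
    """Fraction of codons whose third base is A or T.

    cds: in-frame DNA coding sequence; trailing bases that do not
    complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    thirds = [cds[i + 2] for i in range(0, usable, 3)]
    if not thirds:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "AT" for base in thirds) / len(thirds)

# Toy sequence (not a Mimivirus gene): ATG AAA TTA GGT TAA
toy_cds = "ATGAAATTAGGTTAA"
print(third_position_at_fraction(toy_cds))
```

Averaging this quantity over all genes of a genome gives a rough measure of the mutational bias toward A and T at silent sites.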

    16.
    Despite the degeneracy of the genetic code, whereby different codons encode the same amino acid, alternative codons and amino acids are utilized nonrandomly within and between genomes. Such biases in codon and amino acid usage have been demonstrated extensively in prokaryote genomes and likely reflect a balance between the action of mutation, selection, and genetic drift. Here, we quantify the effects of selection and mutation drift as causes of codon and amino acid-usage bias in a large collection of nematode partial genomes from 37 species spanning approximately 700 Myr of evolution, as inferred from expressed sequence tag (EST) measures of gene expression and from base composition variation. Average G + C content at silent sites among these taxa ranges from 10% to 63%, and EST counts range more than 100-fold, underlying marked differences between the identities of major codons and optimal codons for a given species as well as influencing patterns of amino acid abundance among taxa. Few species in our sample demonstrate a dominant role of selection in shaping intragenomic codon-usage biases, and these are principally free living rather than parasitic nematodes. This suggests that deviations in effective population size among species, with small effective sizes among parasites, are partly responsible for species differences in the extent to which selection shapes patterns of codon usage. Nevertheless, a consensus set of optimal codons emerges that is common to most taxa, indicating that, with some notable exceptions, selection for translational efficiency and accuracy favors similar sets of codons regardless of the major codon-usage trends defined by base compositional properties of individual nematode genomes.

    17.
    Correspondence analysis of amino acid usage was applied to 14,815 complete proteins from the human genome. We found that three major factors influence the variability of the amino acid composition of these proteins, explaining, respectively, 20.4%, 14.7%, and 9.9% of the total variability. The first trend is strongly correlated with the GC content of first and second codon positions and is also significantly correlated with the GC level of the corresponding flanking regions and introns. Therefore, the main force shaping amino acid usage among human proteins is the set of compositional constraints determined by the isochore in which each gene is embedded. The second trend correlates with the hydropathy of each protein and with the frequency of beta-strands. Finally, the third trend is strongly associated with the usage of Cys and the frequency of alpha-helices.
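The "percentage of total variability explained" by each axis in entry 17 comes from the inertia decomposition of correspondence analysis. A minimal sketch of that computation is given below, assuming a table of amino acid counts with one row per protein and no all-zero rows or columns; the function name and the small count table are illustrative inventions, not data from the study.

```python
import numpy as np

def correspondence_inertia(counts):
    """Fraction of total inertia carried by each correspondence axis.

    counts: 2-D array of non-negative counts (rows = proteins,
    columns = amino acids), with no all-zero row or column.
    """
    counts = np.asarray(counts, dtype=float)
    P = counts / counts.sum()          # correspondence matrix
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    expected = np.outer(r, c)
    # Standardized residuals from the independence model.
    S = (P - expected) / np.sqrt(expected)
    singular_values = np.linalg.svd(S, compute_uv=False)
    inertia = singular_values ** 2     # principal inertias, descending
    return inertia / inertia.sum()

# Hypothetical counts of five amino acids in four proteins.
toy_counts = [
    [10, 2, 3, 5, 1],
    [8, 3, 2, 6, 2],
    [2, 9, 7, 1, 4],
    [3, 8, 6, 2, 5],
]
fractions = correspondence_inertia(toy_counts)
print(np.round(fractions, 3))
```

Each returned value plays the role of the percentages quoted in the abstract; correlating the row coordinates on an axis with quantities such as GC content or hydropathy is a separate step not shown here.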

    18.
    Biased usage of synonymous codons has been elucidated under the perspective of cellular tRNA abundance for quite a long time now. Taking advantage of publicly available gene expression data for Saccharomyces cerevisiae, a systematic analysis of the codon and amino acid usages in two different coding regions, corresponding to the regular (helix and strand) and the irregular (coil) protein secondary structures, has been performed. Our analyses suggest that apart from tRNA abundance, mRNA folding stability is another major evolutionary force in shaping the codon and amino acid usage differences between the highly and lowly expressed genes in the S. cerevisiae genome and, surprisingly, it depends on the coding regions corresponding to the secondary structures of the encoded proteins. This is obviously a new paradigm in understanding the codon usage in S. cerevisiae. Differential amino acid usage between highly and lowly expressed genes in the regions coding for the irregular protein secondary structure in S. cerevisiae is explained by the stability of the mRNA folded structure. Irrespective of the protein secondary structural type, the highly expressed genes always tend to encode cheaper amino acids in order to reduce the overall biosynthetic cost of production of the corresponding protein. This study supports the hypothesis that the tRNA abundance is a consequence of and not a reason for the biased usage of amino acids between highly and lowly expressed genes.

    19.
    Local determinants of 3(10)-helix stabilization have been ascertained from the analysis of the crystal structure data base. We have clustered all 5-length substructures from 51 nonhomologous proteins into classes based on the conformational similarity of their backbone dihedral angles. Several clusters, derived from 3(10)-helices and multiple-turn conformations, had strong amino acid sequence patterns not evident among alpha-helices. Aspartate occurred over twice as frequently in the N-cap position of 3(10)-helices as in the N-cap position of alpha-helices. Unlike alpha-helices, 3(10)-helices had few C-termini ending in a left-handed alpha conformation; most 3(10) C-caps adopted an extended conformation. Differences in the distribution of hydrophobic residues among 3(10)- and alpha-helices were also apparent, producing amphipathic 3(10)-helices. Local interactions that stabilize 3(10)-helices can be inferred both from the strong amino acid preferences found for these short helices, as well as from the existence of substructures in which tertiary interactions replace consensus local interactions. Because the folding and unfolding of alpha-helices have been postulated to proceed through reverse-turn and 3(10)-helix intermediates, sequence differences between 3(10)- and alpha-helices can also lend insight into factors influencing alpha-helix initiation and propagation.

    20.
    The complete amino acid sequence of the calcium-binding protein (CaBP) from pig intestinal mucosa has been determined: Ac-Ser-Ala-Gln-Lys-Ser-Pro-Ala-Glu-Leu-Lys-Ser-Ile-Phe-Glu-Lys-Tyr-Ala-Ala-Lys-Glu-Gly-Asp-Pro-Asn-Gln-Leu-Ser-Lys-Glu-Glu-Leu-Lys-Gln-Leu-Ile-Gln-Ala-Glu-Phe-Pro-Ser-Leu-Leu-Lys-Gly-Pro-Arg-Thr-Leu-Asp-Asp-Leu-Phe-Gln-Glu-Leu-Asp-Lys-Asn-Gly-Asn-Gly-Glu-Val-Ser-Phe-Glu-Glu-Phe-Gln-Val-Leu-Val-Lys-Lys-Ile-Ser-Gln-OH. The N-terminal octapeptide sequence was determined by mass spectrometric analysis by Morris and Dell. The first 45 residues of bovine CaBP differ only in six positions from the corresponding sequence of the porcine protein, except that the sequence starts in position two of the porcine sequence. The mammalian intestinal CaBP's belong to the troponin-C superfamily on the basis of an analysis by Barker and Dayhoff.
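A sequence written in hyphenated three-letter codes, as in entry 20, is easier to compare or count once it is converted to one-letter codes. The sketch below shows one way to do this; the function name is hypothetical, the terminal markers `Ac` (N-acetyl) and `OH` are simply dropped, and only the N-terminal octapeptide from the abstract is used as input.

```python
from collections import Counter

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(three_letter_seq):
    """Convert 'Ac-Ser-Ala-...-OH' style text to a one-letter string."""
    tokens = three_letter_seq.split("-")
    # Drop terminal group markers; everything else must be a residue code.
    residues = [t for t in tokens if t not in ("Ac", "OH")]
    return "".join(THREE_TO_ONE[t] for t in residues)

# N-terminal octapeptide of porcine CaBP as given in the abstract.
octapeptide = to_one_letter("Ac-Ser-Ala-Gln-Lys-Ser-Pro-Ala-Glu")
composition = Counter(octapeptide)
print(octapeptide, dict(composition))
```

Applying the same function to the full sequence string from the abstract yields the input needed for composition-based methods such as those in the other entries.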
