Similar literature
1.
We investigated how additives in the folding buffer affect the folding yield of disulfide-containing globular proteins with positive net charges from crude bacterial inclusion bodies. In screening folding conditions for human ribonucleases and a derivative, we found that addition of salt (about 0.4 M) to the folding buffer increased the folding yield. This suggested that electrostatic interaction between polyanionic impurities such as nucleic acids and the cationic unfolded protein led to the formation of aggregates under low-salt conditions. Since inclusion bodies were found to contain nucleic acids regardless of the electrostatic nature of the expressed protein, the electrostatic interaction between phosphate moieties of nucleic acids and basic amino acid residues of a denatured protein may be large enough to cause aggregation, and therefore the addition of salt to a folding buffer may generally be useful for promoting protein folding from crude inclusion bodies. We further systematically investigated additives such as glycerol, guanidinium chloride, and urea that are known to act as chemical chaperones, and found that these additives, together with salt, synergistically improved folding yield. This study, which suggests that the addition of salt to the folding buffer is one of the crucial points to be considered, may pave the way for a systematic investigation of the folding conditions of disulfide-containing foreign proteins from crude bacterial inclusion bodies.

2.
Huang JT  Xing DJ  Huang W 《Proteins》2012,80(8):2056-2062
Bioinformatic studies suggest that additional information provided by nucleic acids is necessary to construct protein three-dimensional structures. We find underlying correlations between the contents of bases. All correlations occur at the third codon position of a gene sequence. Four inverse relationships are observed between u(3) and c(3), between a(3) and g(3), between u(3) and g(3), and between c(3) and a(3); and two positive relationships are apparent between u(3) and a(3), and between c(3) and g(3). Their correlation coefficients reach -0.92, -0.89, -0.83, -0.85, 0.83, and 0.66, respectively, for large proteins with multistate folding kinetics. The interconnection of bases can be ascribed to the choice of synonymous codons associated with protein folding in vivo. In this study, the refolding rate constants of large proteins correlate with the contents of the third base, suggesting that there is an underlying biochemical rationale for guiding protein folding in the choice of synonymous codons.
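
As a rough illustration of the kind of quantity this abstract correlates, the sketch below computes third-codon-position base fractions (written u(3), c(3), a(3), g(3) in the abstract) for a few coding sequences and the Pearson correlation between two of them. The function names and the toy sequences are hypothetical and chosen only for illustration; the abstract does not describe the authors' actual pipeline or data.

```python
from math import sqrt

def third_position_content(cds):
    """Fraction of each base at the third position of complete codons."""
    rna = cds.upper().replace("T", "U")
    thirds = rna[2::3]  # indices 2, 5, 8, ... are third codon positions
    return {base: thirds.count(base) / len(thirds) for base in "UCAG"}

def pearson(xs, ys):
    """Plain Pearson correlation coefficient (undefined if either input is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Toy coding sequences, invented for this example (not real genes).
genes = ["ATGGCTGCAGAT", "ATGGCCGCGGAC", "ATGGCTGCTGAA", "ATGGCGGCCGAG"]
contents = [third_position_content(g) for g in genes]
u3 = [c["U"] for c in contents]
c3 = [c["C"] for c in contents]
print("r(u3, c3) =", pearson(u3, c3))
```

With real data, each gene would contribute one point per base pair (for example u(3) against c(3)), and the coefficients quoted in the abstract would be computed over the whole protein set.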

3.
Correlations between genomic GC contents and amino acid frequencies were studied in the homologous sequences of 12 eubacterial genomes. Results show that the frequencies of amino acids encoded by GC-rich codons increase significantly with genomic GC content, whereas the opposite trend was observed for amino acids encoded by GC-poor codons. Further studies show that not all amino acids change in the direction predicted by genomic GC pressure, suggesting that protein evolution is not entirely dictated by nucleotide frequencies. An amino acid substitution matrix calculated among hydrophobic, amphipathic and hydrophilic amino acid groups shows that amphipathic and hydrophilic amino acids are more frequently substituted by hydrophobic amino acids than hydrophobic amino acids are by hydrophilic or amphipathic ones. This indicates that nucleotide bias induces directional changes in proteome composition that strongly alter hydropathy values. In fact, significant increases in hydrophobicity values were also observed with increasing genomic GC content. Correlations between GC content and amino acid composition in three different predicted protein secondary structures show that hydropathy values increase significantly with GC content in aperiodic and helix structures, whereas strand structures remain insensitive to genomic GC levels. The relative importance of mutation and selection in the evolution of proteins is discussed on the basis of these results.

4.
The conservation profile of a protein is a curve of the conservation levels of amino acids along the sequence. Biologists are usually more interested in individual points on the curve (namely, the conserved amino acids) than in the overall shape of the curve. Here, we show that the conservation curves of proteins bear the imprints of molecules that are evolutionarily coupled to the proteins. Our method is based on recent studies showing that a sequence conservation profile is quantitatively linked to its structural packing profile. We find that the conservation profiles of nucleic acid (NA) binding proteins are better correlated with the packing profiles of the protein–NA complexes than with those of the proteins alone. This indicates that a nucleic acid binding protein evolves to accommodate the nucleic acid in such a way that the residues involved in binding have their conservation levels closely coupled with the specific nucleotides. Proteins 2015; 83:1407–1413. © 2015 Wiley Periodicals, Inc.

5.
A periodic table of codons has been designed in which the codons are in regular locations. The table has four fields (16 places in each), one for each of the four nucleotides (A, U, G, C) in the central codon position. Thus, AAA (lysine), UUU (phenylalanine), GGG (glycine), and CCC (proline) were placed into the corners of the fields as the main codons (and amino acids) of the fields. They were connected to each other by six axes. The resulting nucleic acid periodic table showed perfect axial symmetry for codons. The corresponding amino acid table also displayed periodicity regarding the biochemical properties (charge and hydropathy) of the 20 amino acids and the position of the stop signals. The table emphasizes the importance of the central nucleotide in the codons and predicts that purines control the charge while pyrimidines determine the polarity of the amino acids. This prediction was experimentally tested.

6.
Nucleic acid polymers selected from random sequence space constitute an enormous array of catalytic, diagnostic and therapeutic molecules. Despite the fact that proteins are robust polymers with far greater chemical and physical diversity, success in unlocking protein sequence space remains elusive. We have devised a combinatorial strategy for accessing nucleic acid sequence space corresponding to proteins comprising selected amino acid alphabets. Using the SynthOMIC approach (synthesis of ORFs by multimerizing in-frame codons), representative libraries comprising four amino acid alphabets were fused in-frame to the lambda repressor DNA-binding domain to provide an in vivo selection for self-interacting proteins that reconstitute lambda repressor function. The frequency of self-interactors as a function of amino acid composition ranged over five orders of magnitude, from ∼6% of clones in a library comprising the amino acid residues LARE to ∼0.6 in 10^6 in the MASH library. Sequence motifs were evident by inspection in many cases, and individual clones from each library presented substantial sequence identity with translated proteins by BLAST analysis. We posit that the SynthOMIC approach represents a powerful strategy for creating combinatorial libraries of open reading frames that distils protein sequence space on the basis of three inherent properties: it supports the use of selected amino acid alphabets, eliminates redundant sequences and locally constrains amino acids.

7.
Genetic code redundancy allows most amino acids to be encoded by multiple codons that are non-randomly distributed along coding sequences. An accepted theory explaining the biological significance of such non-uniform codon selection is that codons are translated at different speeds. Thus, varying codon placement along a message may confer variable rates of polypeptide emergence from the ribosome, which may influence the capacity to fold toward the native state. Previous studies report conflicting results regarding whether certain codons correlate with particular structural or folding properties of the encoded protein. This is partly due to the different criteria traditionally used for predicting translation speeds of codons, including their usage frequencies and the concentration of tRNA species capable of decoding them, which do not always correlate. Here, we developed a metric to predict organism-specific relative translation rates of codons based on the availability of tRNA decoding mechanisms: Watson-Crick, non-Watson-Crick or both types of interactions. We determine translation rates of messages by pulse-chase analyses in living Escherichia coli cells and show that sequence engineering based on these concepts predictably modulates translation rates during the elongation phase, in a manner that is superior to codon usage frequency, and significantly impacts folding of the encoded polypeptide. Finally, we demonstrate that sequence harmonization based on expression host tRNA pools, designed to mimic ribosome movement of the original organism, can significantly increase the folding of the encoded polypeptide. These results illuminate how genetic code degeneracy may function to specify properties beyond amino acid encoding, including folding.

8.
Amino acid substitution plays a vital role in both the molecular engineering of proteins and analysis of structure-activity relationships. High-throughput substitution is achieved by codon randomisation, which generates a library of mutants (a randomised gene library) in a single experiment. For full randomisation, key codons are typically replaced with NNN (64 sequences) or NN(G)(CorT) (32 sequences). This obligates cloning of redundant codons alongside those required to encode the 20 amino acids. As the number of randomised codons increases, there is therefore a progressive loss of randomisation efficiency; the number of genes required per protein rises exponentially. The redundant codons cause amino acids to be represented unevenly; for example, methionine is encoded just once within NNN, whilst arginine is encoded six times. Finally, the organisation of the genetic code makes it impossible to encode functional subsets of amino acids (e.g. polar residues only) in a single experiment. Here, we present a novel solution to randomisation where genetic redundancy is eliminated; the number of different genes equals the number of encoded proteins, regardless of codon number. There is no inherent amino acid bias and any required subset of amino acids may be encoded in one experiment. This generic approach should be widely applicable in studies involving randomisation of proteins.
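
The redundancy argument in this abstract can be checked with a short count over the standard genetic code. The sketch below is only illustrative: the helper names are hypothetical, and the 32-codon scheme is assumed here to be the common NNK design (third position restricted to G or T), which may not be exactly the scheme the abstract's notation refers to.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (TTT, TTC, TTA, TTG, TCT, ...); '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}

def codon_counts(third_bases="TCAG"):
    """Codons per amino acid when the third position is limited to third_bases."""
    return Counter(aa for codon, aa in CODON_TABLE.items() if codon[2] in third_bases)

def library_sizes(n_codons, codons_per_site=64, residues_per_site=20):
    """Distinct genes versus distinct proteins for n fully randomised codons (stops ignored)."""
    return codons_per_site ** n_codons, residues_per_site ** n_codons

nnn = codon_counts()       # full NNN randomisation
nnk = codon_counts("GT")   # assumed NNK-style scheme: third base G or T
print("NNN: Met", nnn["M"], "Arg", nnn["R"], "stops", nnn["*"])
print("NNK: Met", nnk["M"], "Arg", nnk["R"], "stops", nnk["*"])
for n in (1, 2, 3):
    genes, proteins = library_sizes(n)
    print(n, "codons:", genes, "genes for", proteins, "proteins")
```

The ratio of genes to proteins grows as (64/20)^n under these assumptions, which is the exponential loss of efficiency the abstract describes.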

9.
An empirical relation between the amino acid composition and three-dimensional folding pattern of several classes of proteins has been determined. Computer-simulated neural networks have been used to assign proteins to one of the following classes based on their amino acid composition and size: (1) 4α-helical bundles, (2) parallel (α/β)8 barrels, (3) nucleotide binding fold, (4) immunoglobulin fold, or (5) none of these. Networks trained on the known crystal structures as well as sequences of closely related proteins are shown to correctly predict folding classes of proteins not represented in the training set with an average accuracy of 87%. Other folding motifs can easily be added to the prediction scheme once larger databases become available. Analysis of the neural network weights reveals that amino acids favoring prediction of a folding class are usually overrepresented in that class and amino acids with unfavorable weights are underrepresented in composition. The neural networks utilize combinations of these multiple small variations in amino acid composition in order to make a prediction. The favorably weighted amino acids in a given class also form the most intramolecular interactions with other residues in proteins of that class. A detailed examination of the contacts of these amino acids reveals some general patterns that may help stabilize each folding class. © 1993 Wiley-Liss, Inc.

10.
In addition to the well-established sense-antisense complementarity abundantly present in the nucleic acid world and serving as a basic principle of the specific double-helical structure of DNA, production of mRNA, and genetic code-based biosynthesis of proteins, sense-antisense complementarity is also present in proteins, where sense and antisense peptides were shown to interact with each other with increased probability. In nucleic acids, sense-antisense complementarity is achieved via the Watson-Crick complementarity of the base pairs or nucleotide pairing. In proteins, the complementarity between sense and antisense peptides depends on a specific hydropathic pattern, where codons for hydrophilic and hydrophobic amino acids in a sense peptide are complemented by the codons for hydrophobic and hydrophilic amino acids in its antisense counterpart. We show here that, in addition to this pattern of complementary hydrophobicity, sense and antisense peptides are characterized by complementary order-disorder patterns and show complementarity in the sequence distribution of their disorder-based interaction sites. We also discuss how this order-disorder complementarity can be related to protein evolution.

11.

Background

In plant organelles, specific messenger RNAs (mRNAs) are subjected to conversion editing, a process that often converts the first or second nucleotide of a codon and hence the encoded amino acid. No systematic patterns in converted sites were found on mRNAs, and the converted sites rarely encoded residues located at the active sites of proteins. The role and origin of RNA editing in plant organelles remain to be elucidated.

Results

Here we study the relationship between amino acid residues encoded by edited codons and the structural characteristics of these residues within proteins, e.g., in protein-protein interfaces, elements of secondary structure, or protein structural cores. We find that the residues encoded by edited codons are significantly biased toward involvement in helices and protein structural cores. RNA editing can convert codons for hydrophilic to hydrophobic amino acids. Hence, only the edited form of an mRNA can be translated into a polypeptide with helix-preferring and core-forming residues at the appropriate positions, which is often required for a protein to form a functional three-dimensional (3D) structure.

Conclusion

We have performed a novel analysis of the location of residues affected by RNA editing in proteins in plant organelles. This study documents that RNA editing sites are often found in positions important for 3D structure formation. Without RNA editing, protein folding will not occur properly, thus affecting gene expression. We suggest that RNA editing may have conferred an evolutionary advantage by acting as a mechanism to reduce susceptibility to DNA damage, allowing an increase in GC content in DNA while maintaining the RNA codons essential to encode residues required for protein folding and activity.

12.
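Zhang Jing  Gu Baohong 《动物学研究》(Zoological Research) 1998,19(5):350-358
Analysis of the secondary structures of mRNAs encoding mature peptides shows that each codon has a certain positional preference within the mRNA secondary structure, and this preference appears to be consistent with the conformational properties of the corresponding amino acid. Most codons encoding hydrophobic amino acids are located in the more stable stem regions of the mRNA secondary structure; conversely, most codons encoding hydrophilic amino acids are located in the flexible loop regions. This result supports the recently reported conclusion that three-dimensional structural information is transferred between mRNA and protein.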

13.
Proteins evolve under a myriad of biophysical selection pressures that collectively control the patterns of amino acid substitutions. These evolutionary pressures are sufficiently consistent over time and across protein families to produce substitution patterns, summarized in global amino acid substitution matrices such as BLOSUM, JTT, WAG, and LG, which can be used to successfully detect homologs, infer phylogenies, and reconstruct ancestral sequences. Although the factors that govern the variation of amino acid substitution rates have received much attention, the influence of thermodynamic stability constraints remains unresolved. Here we develop a simple model to calculate amino acid substitution matrices from evolutionary dynamics controlled by a fitness function that reports on the thermodynamic effects of amino acid mutations in protein structures. This hybrid biophysical and evolutionary model accounts for nucleotide transition/transversion rate bias, multi‐nucleotide codon changes, the number of codons per amino acid, and thermodynamic protein stability. We find that our theoretical model accurately recapitulates the complex yet universal pattern observed in common global amino acid substitution matrices used in phylogenetics. These results suggest that selection for thermodynamically stable proteins, coupled with nucleotide mutation bias filtered by the structure of the genetic code, is the primary driver behind the global amino acid substitution patterns observed in proteins throughout the tree of life.

14.
BACKGROUND: The composition and sequence of amino acids in a protein may serve the underlying needs of the nucleic acids that encode the protein (the genome phenotype). In extreme form, amino acids become mere placeholders inserted between functional segments or domains and, apart from increasing protein length, play no role in the specific function or structure of a protein (the conventional phenotype). METHODS: We studied the genomes of two malarial parasites and 521 prokaryotes (144 complete) that differ widely in GC% and optimum growth temperature, comparing the base compositions of the protein coding regions and corresponding lengths (kilobases). RESULTS: Malarial parasites show distinctive responses to base-compositional pressures that increase as protein lengths increase. A low-GC% species (Plasmodium falciparum) is likely to have more placeholder amino acids than an intermediate-GC% species (P. vivax), so that homologous proteins are longer. In prokaryotes, GC% is generally greater and AG% is generally less in open reading frames (ORFs) encoding long proteins. The increased GC% in long ORFs increases as species' GC% increases, and decreases as species' AG% increases. In low- and intermediate-GC% prokaryotic species, increases in ORF GC% as encoded proteins increase in length are largely accounted for by the base compositions of first and second (amino acid-determining) codon positions. In high-GC% prokaryotic species, first and third (non-amino acid-determining) codon positions play this role. CONCLUSION: In low- and intermediate-GC% prokaryotes, placeholder amino acids are likely to be well defined, corresponding to codons enriched in G and/or C at first and second positions. In high-GC% prokaryotes, placeholder amino acids are likely to be less well defined. Increases in ORF GC% as encoded proteins increase in length are greater in mesophiles than in thermophiles, which are constrained from increasing protein lengths in response to base-composition pressures.

15.
The theory of "codon-amino acid coevolution" was first proposed by Woese in 1967. It suggests that there is a stereochemical matching, that is, affinity, between amino acids and certain of the base triplet sequences that code for those amino acids. We have constructed a Common Periodic Table of Codons and Amino Acids, where the Nucleic Acid Table showed perfect axial symmetry for codons and the corresponding Amino Acid Table also displayed periodicity regarding the biochemical properties (charge and hydrophobicity) of the 20 amino acids and the position of the stop signals. The Table indicates that the middle (2nd) nucleotide in the codon has a prominent role in determining some of the structural features of the amino acids. The possibility that physical contact between codons and amino acids might exist was tested on restriction enzymes. Many recognition site-like sequences were found in the coding sequences of these enzymes and as many as 73 examples of codon-amino acid co-location were observed in the 7 known 3D structures (December 2003) of endonuclease-nucleic acid complexes. These results indicate that the smallest possible units of specific nucleic acid-protein interaction are indeed the stereochemically compatible codons and amino acids.

16.
The universal genetic code includes 20 common amino acids. In addition, selenocysteine (Sec) and pyrrolysine (Pyl), known as the twenty-first and twenty-second amino acids, are encoded by UGA and UAG, respectively, which are codons that usually function as stop signals. The discovery of Sec and Pyl suggested that the genetic code could be further expanded by reprogramming stop codons. To search for a putative twenty-third amino acid, we employed various tRNA identification programs that scanned 16 archaeal and 130 bacterial genomes for tRNAs with anticodons corresponding to the three stop signals. Our data suggest that the occurrence of additional amino acids that are widely distributed and genetically encoded is unlikely.

17.
Biased usage of synonymous codons has long been explained in terms of cellular tRNA abundance. Taking advantage of publicly available gene expression data for Saccharomyces cerevisiae, a systematic analysis of the codon and amino acid usages in two different coding regions, corresponding to the regular (helix and strand) and the irregular (coil) protein secondary structures, has been performed. Our analyses suggest that, apart from tRNA abundance, mRNA folding stability is another major evolutionary force shaping the codon and amino acid usage differences between the highly and lowly expressed genes in the S. cerevisiae genome and, surprisingly, that it depends on the coding regions corresponding to the secondary structures of the encoded proteins. This is obviously a new paradigm in understanding codon usage in S. cerevisiae. Differential amino acid usage between highly and lowly expressed genes in the regions coding for the irregular protein secondary structure in S. cerevisiae is explained by the stability of the mRNA folded structure. Irrespective of the protein secondary structural type, the highly expressed genes always tend to encode cheaper amino acids in order to reduce the overall biosynthetic cost of production of the corresponding protein. This study supports the hypothesis that tRNA abundance is a consequence of, and not a reason for, the biased usage of amino acids between highly and lowly expressed genes.

18.
The complete nucleotide sequences of three cloned cDNAs corresponding to human liver apolipoprotein E (apo-E) mRNA were determined. Analysis of the longest cDNA showed that it contained 1157 nucleotides of mRNA sequence with a 5'-terminal nontranslated region of 61 nucleotides, a signal peptide region corresponding to 18 amino acids, a mature protein region corresponding to 299 amino acids, and a 3'-terminal nontranslated region of 142 nucleotides. The inferred amino acid sequences from two cDNAs were identical and corresponded to the amino acid sequence for plasma apo-E3 that has been reported previously (Rall, S. C., Jr., Weisgraber, K. H., and Mahley, R. W. (1982) J. Biol. Chem. 257, 4171-4178). The third cDNA differed from the other two cDNAs in five nucleotide positions. Three of these differences occurred in the third nucleotide position of amino acid codons, resulting in no change in the corresponding amino acids at residues Val-85, Ser-223, and Gln-248. The other two altered nucleotides occurred in the first nucleotide position of codons, leading to changes in the amino acids encoded. In the variant sequence, a threonine replaced the normal alanine at residue 99 and a proline replaced the normal alanine at residue 152. We have concluded that the human liver donor was heterozygous for the epsilon 3 genotype. The variant cDNA corresponds to a new, previously undescribed variant form of apo-E in which the amino acid substitutions of the protein are electrophoretically silent; it would probably be undetectable by standard apo-E phenotyping methods. The amino acid substitution at position 152 occurs in a region of apo-E that appears to be important for receptor binding, and it may have clinical significance.
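
The segment lengths quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that one stop codon sits between the coding region and the 3'-terminal nontranslated region and is not counted in either; the abstract does not say how the stop codon is counted, so that is an assumption, and the function name is hypothetical.

```python
def mrna_length(utr5_nt, coding_aa_segments, utr3_nt, stop_codons=1):
    """Total mRNA length in nucleotides from UTR lengths and coding segments given in amino acids."""
    coding_nt = 3 * (sum(coding_aa_segments) + stop_codons)
    return utr5_nt + coding_nt + utr3_nt

# 61 nt 5' region, 18 aa signal peptide, 299 aa mature protein, 142 nt 3' region
total = mrna_length(61, [18, 299], 142)
print(total)
```

Under that assumption the segments add up to the 1157 nucleotides reported for the longest cDNA: 61 + 3 × (18 + 299 + 1) + 142.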

19.
Variations in GC content between genomes have been extensively documented. Genomes with comparable GC contents can, however, still differ in the apportionment of the G and C nucleotides between the two DNA strands. This asymmetric strand bias is known as GC skew. Here, we have investigated the impact of differences in nucleotide skew on the amino acid composition of the encoded proteins. We compared orthologous genes between animal mitochondrial genomes that show large differences in GC and AT skews. Specifically, we compared the mitochondrial genomes of mammals, which are characterized by a negative GC skew and a positive AT skew, to those of flatworms, which show the opposite skews for both GC and AT base pairs. We found that the mammalian proteins are highly enriched in amino acids encoded by CA-rich codons (as predicted by their negative GC and positive AT skews), whereas their flatworm orthologs were enriched in amino acids encoded by GT-rich codons (also as predicted from their skews). We found that these differences in mitochondrial strand asymmetry (measured as GC and AT skews) can have very large, predictable effects on the composition of the encoded proteins.
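
For readers unfamiliar with the skew measures named in this abstract, the sketch below uses the commonly cited definitions, GC skew = (G − C)/(G + C) and AT skew = (A − T)/(A + T), computed on one strand. The abstract itself does not give a formula, so these definitions, the function names and the toy sequence are assumptions made for illustration.

```python
def base_skew(seq, x, y):
    """Return (x - y) / (x + y) for base counts on one strand; 0.0 if neither base occurs."""
    seq = seq.upper()
    n_x, n_y = seq.count(x), seq.count(y)
    if n_x + n_y == 0:
        return 0.0
    return (n_x - n_y) / (n_x + n_y)

def gc_skew(seq):
    return base_skew(seq, "G", "C")

def at_skew(seq):
    return base_skew(seq, "A", "T")

# Toy sequence invented for this example: C-rich and A-rich, like the
# "negative GC skew, positive AT skew" pattern described for mammals.
toy = "CCCAACAG"
print("GC skew:", gc_skew(toy), "AT skew:", at_skew(toy))
```

A strand with more C than G and more A than T gives a negative GC skew and a positive AT skew, which is the combination the abstract associates with enrichment in amino acids encoded by CA-rich codons.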

20.
The cDNA clones encoding the precursor form of glycinin A3B4 subunit have been identified from a library of soybean cotyledonary cDNA clones in the plasmid pBR322 by a combination of differential colony hybridizations, and then by immunoprecipitation of hybrid-selected translation product with A3-mono-specific antiserum. A recombinant plasmid, designated pGA3B41425, from one of six clones covering codons for the NH2-terminal region of the subunit was sequenced, and the amino acid sequence was inferred from the nucleotide sequence, which showed that the mRNA codes for a precursor protein of 516 amino acids. Analysis of this cDNA also showed that it contained 1786 nucleotides of mRNA sequence with a 5'-terminal nontranslated region of 46 nucleotides, a signal peptide region corresponding to 24 amino acids, an A3 acidic subunit region corresponding to 320 amino acids followed by a B4 basic subunit region corresponding to 172 amino acids, and a 3'-terminal nontranslated region of 192 nucleotides, which contained two characteristic AAUAAA sequences that ended 110 nucleotides and 26 nucleotides from a 3'-terminal poly(A) segment, respectively. Our results confirm that glycinin is synthesized as precursor polypeptides which undergo post-translational processing to form the nonrandom polypeptide pairs via disulfide bonds. The inferred amino acid sequence of the mature basic subunit, B4, was compared to that of the basic subunit of pea legumin, Leg Beta, which contained 185 amino acids. Using an alignment that permitted a maximum homology of amino acids, it was found that overall 42% of the amino acid positions are identical in both proteins. These results led us to conclude that both storage proteins have a common ancestor.
