Similar articles
Found 20 similar articles (search time: 15 ms)
1.
Protein engineering by inserting stretches of random DNA sequences into target genes, in combination with adequate screening or selection methods, is a versatile technique to elucidate and improve protein functions. Established compounds for generating semi-random DNA sequences are spiked oligonucleotides, which are synthesised by interspersing wild-type (wt) nucleotides of the target sequence with certain amounts of other nucleotides. Directed spiking strategies reduce the complexity of a library to a manageable format compared with completely random libraries. Computational algorithms make it feasible to calculate appropriate nucleotide mixtures to encode specified amino acid subpopulations. The crucial element in the ranking of spiked codons generated during an iterative algorithm is the scoring function. In this report three scoring functions are analysed: the sum-of-square-differences function s, a modified cubic function c, and a scoring function m derived from maximum likelihood considerations. The impact of these scoring functions on calculated amino acid distributions is demonstrated by an example of mutagenising a domain surrounding the active site serine of subtilisin-like proteases. At default weight settings of one for each amino acid, the new scoring function m is superior to functions s and c in finding matches to a given amino acid population.

2.
Models of molecular evolution tend to be overly simplistic caricatures of biology that are prone to assigning high probabilities to biologically implausible DNA or protein sequences. Here, we explore how to construct time-reversible evolutionary models that yield stationary distributions of sequences that match given target distributions. By adopting comparatively realistic target distributions, evolutionary models can be improved. Instead of focusing on estimating parameters, we concentrate on the population genetic implications of these models. Specifically, we obtain estimates of the product of effective population size and relative fitness difference of alleles. The approach is illustrated with two applications to protein-coding DNA. In the first, a codon-based evolutionary model yields a stationary distribution of sequences, which, when the sequences are translated, matches a variable-length Markov model trained on human proteins. In the second, we introduce an insertion-deletion model that describes selectively neutral evolutionary changes to DNA. We then show how to modify the neutral model so that its stationary distribution at the amino acid level can match a profile hidden Markov model, such as the one associated with the Pfam database.

3.
Tang L, Gao H, Zhu X, Wang X, Zhou M, Jiang R. BioTechniques 2012, 52(3):149-158
Site-saturation mutagenesis is a powerful tool for protein optimization due to its efficiency and simplicity. A degenerate codon NNN or NNS (or NNK) is often used to encode the 20 standard amino acids, but this produces redundant codons and causes an uneven distribution of amino acids in the constructed library. Here we present a novel "small-intelligent" strategy to construct mutagenesis libraries that have a minimal gene library size without inherent amino acid biases, stop codons, or rare codons of Escherichia coli by coupling well-designed combinatorial degenerate primers with suitable PCR-based mutagenesis methods. The designed primer mixture contains exactly one codon per amino acid and thus allows the construction of small-intelligent mutagenesis libraries with one gene per protein. In addition, the software tool DC-Analyzer was developed to assist in primer design according to the user-defined randomization scheme for library construction. This small-intelligent strategy was successfully applied to the randomization of halohydrin dehalogenases with one or two randomized sites. With the help of DC-Analyzer, the strategy was proven to be as simple as NNS randomization and could serve as a general tool to efficiently randomize target genes at positions of interest.
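The redundancy mentioned in this abstract can be illustrated with a minimal Python sketch that expands a degenerate codon and counts how many codons encode each amino acid under the standard genetic code. The helper names `expand` and `codon_counts` are hypothetical and are not part of DC-Analyzer; only the IUPAC symbols needed for NNN, NNK, and NNS are included.

```python
from collections import Counter
from itertools import product

# Standard genetic code with codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Subset of IUPAC degenerate nucleotide symbols (illustrative, not exhaustive).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT", "S": "CG"}


def expand(degenerate):
    """Return every concrete codon encoded by a three-letter degenerate codon."""
    return ["".join(c) for c in product(*(IUPAC[symbol] for symbol in degenerate))]


def codon_counts(degenerate):
    """Count how many concrete codons encode each amino acid (or stop, '*')."""
    return Counter(CODON_TABLE[codon] for codon in expand(degenerate))


if __name__ == "__main__":
    for scheme in ("NNN", "NNK", "NNS"):
        counts = codon_counts(scheme)
        print(scheme, len(expand(scheme)), "codons,", counts["*"], "stop codon(s)")
```

Under these assumptions, NNK and NNS each expand to 32 codons that still include one stop codon and encode some amino acids three times and others once, which is the unevenness the one-codon-per-amino-acid primer mixture is meant to avoid.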

4.
In the past, two kinds of Markov models have been considered to describe protein sequence evolution. Codon-level models have been mechanistic, with a small number of parameters designed to take into account features such as transition-transversion bias, codon frequency bias, and synonymous-nonsynonymous amino acid substitution bias. Amino acid models have been empirical, attempting to summarize the replacement patterns observed in large quantities of data and not explicitly considering the distinct factors that shape protein evolution. We have estimated the first empirical codon model (ECM). Previous codon models assume that protein evolution proceeds only by successive single nucleotide substitutions, but our results indicate that model accuracy is significantly improved by incorporating instantaneous doublet and triplet changes. We also find that the affiliations between codons, the amino acid each encodes, and the physicochemical properties of the amino acids are the main factors driving the process of codon evolution. Neither multiple nucleotide changes nor the strong influence of the genetic code nor amino acids' physicochemical properties form a part of standard mechanistic models and their views of how codon evolution proceeds. We have implemented the ECM for likelihood-based phylogenetic analysis, and an assessment of its ability to describe protein evolution shows that it consistently outperforms comparable mechanistic codon models. We point out the biological interpretation of our ECM and possible consequences for studies of selection.

5.
cDNA clones encoding the murine senile amyloid protein (ASSAM) have been isolated from animal models of accelerated senescence (SAM-P/1) and from normal aging (SAM-R/1). Immunochemical and protein sequence studies revealed that apolipoprotein (apo) A-II is a serum precursor of ASSAM. A 17-base synthetic oligonucleotide based on residues 39-44 of ASSAM was used as a hybridization probe for screening newly constructed SAM-P/1 and SAM-R/1 liver cDNA libraries. The structure of murine apo A-II cDNA is of interest because of the amino acid substitution found in ASSAM and serum apo A-II of SAM-P: in SAM-R or other random-bred slc:ICR mice, amino acid residue 5 of mature apo A-II is proline, but in SAM-P this amino acid is changed to glutamine. This amino acid replacement is caused by two nucleotide substitutions (CCA, the proline codon, to CAG, the glutamine codon). The third base mutation may not be relevant to the substitution of amino acid. Attention is directed to the relation of this amino acid substitution to the specific deposition of apo A-II as a tissue amyloid fibril.

6.
Zou X, Pham TK, Wright PC, Noirel J. Genomics 2012, 100(4):240-244
Although protein expression and regulation have been intensively studied, a complete picture of their mechanisms is still to be drawn. Analysis of high-throughput quantitative proteomics data provides a way to better understand protein regulation. Here, we introduce a bioinformatic analysis method to correlate protein regulation with individual amino acid patterns. We compare the amino acid composition between groups of regulated and unregulated proteins and investigate the correlation between codon usage patterns and protein regulation levels in two Sulfolobus species in "biofilm vs planktonic" experiments. The identified amino acids can then be associated with the regulation of specific gene functions. Strikingly, our analysis shows that functional categories of regulated proteins with similar composition and codon usage pattern of specific amino acids behave similarly. This finding can contribute to a better understanding of protein and gene expression regulation and could find applications in gene optimisation.

7.
Miyazawa S. PLoS ONE 2011, 6(12):e28892
BACKGROUND: A mechanistic codon substitution model, in which each codon substitution rate is proportional to the product of a codon mutation rate and the average fixation probability depending on the type of amino acid replacement, has advantages over nucleotide, amino acid, and empirical codon substitution models in evolutionary analysis of protein-coding sequences. It can approximate a wide range of codon substitution processes. If no selection pressure on amino acids is taken into account, it will become equivalent to a nucleotide substitution model. If mutation rates are assumed not to depend on the codon type, then it will become essentially equivalent to an amino acid substitution model. Mutation at the nucleotide level and selection at the amino acid level can be separately evaluated. RESULTS: The present scheme for single nucleotide mutations is equivalent to the general time-reversible model, but multiple nucleotide changes in infinitesimal time are allowed. Selective constraints on the respective types of amino acid replacements are tailored to each gene in a linear function of a given estimate of selective constraints. Their good estimates are those calculated by maximizing the respective likelihoods of empirical amino acid or codon substitution frequency matrices. Akaike and Bayesian information criteria indicate that the present model performs far better than the other substitution models for all five phylogenetic trees of highly-divergent to highly-homologous sequences of chloroplast, mitochondrial, and nuclear genes. It is also shown that multiple nucleotide changes in infinitesimal time are significant in long branches, although they may be caused by compensatory substitutions or other mechanisms. The variation of selective constraint over sites fits the datasets significantly better than variable mutation rates, except for 10 slow-evolving nuclear genes of 10 mammals. 
A critical finding for phylogenetic analysis is that assuming variable mutation rates over sites leads to the overestimation of branch lengths.

8.
Amino acid usage is an essential measure for the potential growth stage because of its connection to protein structure and function. The objective of this paper was to analyze chromosome features in the plastid region of rice, represented by nucleotide, synonymous codon, and amino acid usage, in order to predict gene expression through codon usage patterns. The results showed that the values of the codon adaptation index ranged from 0.733 in chromosome 9 to 0.631 in chromosome 8; the full lengths of these two chromosomes were 3738 and 1635, respectively. The highest guanine and cytosine content was 60% in chromosome 9, while the lowest was 37% in chromosome 11. Eight chromosomes (ch1, ch2, ch3, ch5, ch7, ch8, ch10, and ch12) had modified relative codon bias values above the threshold (0.66), especially for cysteine in ch1, ch2, ch5, ch10, and ch12. The remaining chromosomes were below the threshold. Relative synonymous codon usage analysis showed that the over-represented amino acids were asparagine, aspartate, cysteine, glutamate, and phenylalanine across all 12 chromosomes. These results establish a platform for further projects on rice breeding and genetics and on codon optimization for developing varieties. They will also help breeders select desirable genes across the genome to improve target traits.

9.
Libraries of random sequence polypeptides are useful as sources of unevolved proteins, novel ligands, and potential lead compounds for the development of vaccines and therapeutics. The expression of small random peptides has been achieved previously using DNA synthesized with equimolar mixtures of nucleotides. For many potential uses of random polypeptide libraries, concerns such as avoiding termination codons and matching target amino acid compositions make more complex designs necessary. In this study, three mixtures of nucleotides, corresponding to the three positions in the codon, were designed such that semirandom DNA synthesized by repeated cycles of the three mixtures created an open reading frame encoding random sequence polypeptides with desired ensemble characteristics. Two methods were used to design the nucleotide mixtures: the manual use of a spreadsheet and a refining grid search algorithm. Using design targets of less than or equal to 1% stop codons and an amino acid composition based on the average ratios observed in natural, globular proteins, the search methods yielded similar nucleotide ratios. Semirandom DNA, synthesized with a designed, three-residue repeat pattern, can encode libraries of very high diversity and represents an important tool for the construction of random polypeptide libraries.
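To make the design problem in this abstract concrete, here is a minimal Python sketch that computes the expected amino acid and stop-codon frequencies from three per-position nucleotide mixtures, assuming the positions are sampled independently. The function name `amino_acid_distribution` and the example mixtures are hypothetical; they are not the mixtures or the grid search reported in the study.

```python
from itertools import product

# Standard genetic code with codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acid_distribution(mixtures):
    """Expected frequency of each amino acid (and stop, '*').

    `mixtures` is a sequence of three dicts, one per codon position,
    each mapping a base to its fraction in that position's mixture.
    """
    distribution = {}
    for codon, aa in CODON_TABLE.items():
        probability = 1.0
        for position, base in enumerate(codon):
            probability *= mixtures[position].get(base, 0.0)
        distribution[aa] = distribution.get(aa, 0.0) + probability
    return distribution


if __name__ == "__main__":
    equimolar = {base: 0.25 for base in BASES}
    # Illustrative third-position mixture with only G and T (assumed values).
    g_or_t = {"G": 0.5, "T": 0.5}
    for label, mixtures in (
        ("equimolar", [equimolar, equimolar, equimolar]),
        ("G/T at position 3", [equimolar, equimolar, g_or_t]),
    ):
        stop_fraction = amino_acid_distribution(mixtures)["*"]
        print(f"{label}: stop codon fraction = {stop_fraction:.4f}")
```

A search over mixtures, whether by spreadsheet or by grid refinement, could score each candidate by comparing such a distribution with the target composition and the stop-codon limit.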

10.
Miller SR. Molecular Ecology 2003, 12(5):1237-1246
Determining the molecular basis of enzyme adaptation is central to understanding the evolution of environmental tolerance but is complicated by the fact that not all amino acid differences between ecologically divergent taxa are adaptive. Analysing patterns of nucleotide sequence evolution can potentially guide the investigation of protein adaptation by identifying candidate codon sites on which diversifying selection has been operating. Here, I test whether there is evidence for molecular adaptation of the carbon fixation gene rbcL for a clade of hot spring cyanobacteria in the genus Synechococcus that has diverged in thermotolerance. Amino acid replacements during Synechococcus radiation have resulted in an increase in the number of hydrophobic residues in the RbcLs of more thermotolerant strains. A similar increase in hydrophobicity has been observed for many thermostable proteins. Maximum likelihood models that allow for heterogeneity among codon sites in the ratio of nonsynonymous to synonymous nucleotide substitutions estimated a class of amino acid sites as a target of positive selection. Depending on the model, a single amino acid site that interacts with a flexible element involved in the opening and closing of the active site was estimated with either low or moderate support to be a member of this class. Site-directed mutagenesis approaches are being explored in order to directly test its adaptive significance.

11.
Correspondence analysis of 28 proteomes selected to span the entire realm of prokaryotes revealed universal biases in the proteins' amino acid distribution. Integral inner membrane proteins always form an individual cluster, which can then be used to predict protein localisation in unknown proteomes, independently of the organism's biotope or kingdom. Orphan proteins are consistently rich in aromatic residues. Another bias is also ubiquitous: the amino acid composition is driven by the G + C content of the first codon position. An unexpected bias is driven, in many proteomes, by the AAN box of the genetic code, suggesting some functional biochemical relationship between asparagine and lysine. Less-significant biases are driven by the rare amino acids, cysteine and tryptophan. Some allow identification of species-specific functions or localisation such as surface or exported proteins. Errors in genome annotations are also revealed by correspondence analysis, making it useful for quality control and correction.

12.
Human apolipoprotein (apo) B exists in plasma as two isoproteins designated apoB-100 and apoB-48. ApoB-100 (512 kDa) and apoB-48 (250 kDa) are synthesized by the liver and intestine, respectively. Analysis of apoB cDNA clones isolated from a human intestinal cDNA library revealed that the intestinal apoB mRNA contains a new in-frame translational stop codon. This premature stop codon is generated by a single base substitution of a 'C' to 'T' at nucleotide 6538, which converts the codon 'CAA' coding for the amino acid glutamine residue 2153 to an in-frame stop codon 'TAA'. The generation of a stop codon in the intestinal apoB mRNA appears to be tissue specific, since it has not been reported in cDNA clones isolated from human liver cDNA libraries, which code for the 4536 amino acid apoB-100. A potential polyadenylation signal sequence 'AATAAA' was also identified 390 bases downstream from the new stop codon. The new stop codon in the human intestinal apoB mRNA provides a potential mechanism for the biosynthesis of intestinal apoB-48.

13.
14.
15.
Singer GA, Hickey DA. Gene 2003, 317(1-2):39-47
A number of recent studies have shown that thermophilic prokaryotes have distinguishable patterns of both synonymous codon usage and amino acid composition, indicating the action of natural selection related to thermophily. On the other hand, several other studies of whole genomes have illustrated that nucleotide bias can have dramatic effects on synonymous codon usage and also on the amino acid composition of the encoded proteins. This raises the possibility that the thermophile-specific patterns observed at both the codon and protein levels are merely reflections of a single underlying effect at the level of nucleotide composition. Moreover, such an effect at the nucleotide level might be due entirely to mutational bias. In this study, we have compared the genomes of thermophiles and mesophiles at three levels: nucleotide content, codon usage and amino acid composition. Our results indicate that the genomes of thermophiles are distinguishable from mesophiles at all three levels and that the codon and amino acid frequency differences cannot be explained simply by the patterns of nucleotide composition. At the nucleotide level, we see a consistent tendency for the frequency of adenine to increase at all codon positions within the thermophiles. Thermophiles are also distinguished by their pattern of synonymous codon usage for several amino acids, particularly arginine and isoleucine. At the protein level, the most dramatic effect is a two-fold decrease in the frequency of glutamine residues among thermophiles. These results indicate that adaptation to growth at high temperature requires a coordinated set of evolutionary changes affecting (i) mRNA thermostability, (ii) stability of codon-anticodon interactions and (iii) increased thermostability of the protein products. We conclude that elevated growth temperature imposes selective constraints at all three molecular levels: nucleotide content, codon usage and amino acid composition. 
In addition to these multiple selective effects, however, the genomes of both thermophiles and mesophiles are often subject to superimposed large changes in composition due to mutational bias.

16.
In vitro selection and directed evolution of peptides from mRNA display are powerful strategies to find novel peptide ligands that bind to target biomolecules. In this study, we expanded the mRNA display method to include multiple nonnatural amino acids by introducing three different four-base codons at a randomly selected single position on the mRNA. Another nonnatural amino acid may be introduced by suppressing an amber codon that may appear from a (NNK)n nucleotide sequence on the mRNA. The mRNA display was expressed in an Escherichia coli in vitro translation system in the presence of three types of tRNAs carrying different four-base anticodons and a tRNA carrying an amber anticodon, the tRNAs being chemically aminoacylated with different nonnatural amino acids. The complexity of the starting mRNA-displayed peptide library was estimated to be 1.1 × 10^12 molecules. The effectiveness of the four-base codon-mediated mRNA display method was demonstrated in the selection of biocytin-containing peptides on streptavidin-coated beads. Moreover, a novel streptavidin-binding nonnatural peptide containing benzoylphenylalanine was obtained from the nonnatural peptide library. The nonnatural peptide library from the four-base codon-mediated mRNA display provides much wider functional and structural diversity than conventional peptide libraries that are constituted from 20 naturally occurring amino acids.
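The remark that an amber codon "may appear" from an (NNK)n sequence can be quantified with a short Python sketch. It assumes equimolar bases at every degenerate position and independent codons; the function name `amber_probability` and the values of n used below are hypothetical, since the abstract does not state n.

```python
def amber_probability(n_codons: int) -> float:
    """Probability that at least one amber codon (TAG) occurs in (NNK)n.

    NNK allows 4 * 4 * 2 = 32 equally likely codons under the equimolar
    assumption, and exactly one of them (TAG) is the amber stop codon.
    """
    return 1.0 - (31.0 / 32.0) ** n_codons


if __name__ == "__main__":
    # Illustrative lengths only; not taken from the study.
    for n in (1, 5, 10, 20):
        print(f"n = {n:2d}: P(at least one TAG) = {amber_probability(n):.3f}")
```

The probability grows with n, so longer random regions are more likely to contain a position where an amber-suppressing tRNA could insert a nonnatural amino acid.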

17.
18.
Recently, it has become possible to reprogram the protein synthesis machinery such that numerous noncanonical amino acids can be translated into target sequences, yielding tailor-made proteins. The canonical amino acid tryptophan (Trp), encoded by a single nucleotide triplet (UGG), is a particularly interesting target for protein engineering and design. Trp residues can be substituted with a variety of analogs and surrogates generated biosynthetically or by organic chemistry. Among them, nitrogen-containing tryptophan analogs occupy a central position, as they have distinct chemical properties in comparison with aliphatic amines and imines. They resemble purine bases of DNA and share their capacity for pH-sensitive intramolecular charge transfer. These special properties of the analogs can be directly transmitted into related protein structures via in vivo ribosome-mediated translation. Proteins expressed in this way are further endowed with unique properties such as new spectral, altered redox, and titration features, or might serve as useful biomaterials. We present and discuss current works and future developments in protein engineering with nitrogen-containing tryptophan analogs and related compounds as well as their relevance for academic and applicative research.
The term noncanonical amino acid refers to an amino acid that does not belong, in contrast to a canonical amino acid, to the genetically encoded, proteinogenic amino acids. The term analog defines a strict isosteric exchange of a canonical/noncanonical amino acid (e.g., tryptophan/azatryptophan), while the term surrogate defines a nonisosteric change (e.g., tryptophan/azulene). Mutant denotes a protein in which the wild-type sequence was changed by site-directed mutagenesis (codon manipulation on the DNA level) within the repertoire of the standard amino acids. Variant denotes a protein in which one or more canonical amino acids derived from a wild-type or a mutant sequence were replaced by a noncanonical one (expanded amino acid repertoire, codon reassignment on the protein translation level).

19.
Proteins evolve under a myriad of biophysical selection pressures that collectively control the patterns of amino acid substitutions. These evolutionary pressures are sufficiently consistent over time and across protein families to produce substitution patterns, summarized in global amino acid substitution matrices such as BLOSUM, JTT, WAG, and LG, which can be used to successfully detect homologs, infer phylogenies, and reconstruct ancestral sequences. Although the factors that govern the variation of amino acid substitution rates have received much attention, the influence of thermodynamic stability constraints remains unresolved. Here we develop a simple model to calculate amino acid substitution matrices from evolutionary dynamics controlled by a fitness function that reports on the thermodynamic effects of amino acid mutations in protein structures. This hybrid biophysical and evolutionary model accounts for nucleotide transition/transversion rate bias, multi‐nucleotide codon changes, the number of codons per amino acid, and thermodynamic protein stability. We find that our theoretical model accurately recapitulates the complex yet universal pattern observed in common global amino acid substitution matrices used in phylogenetics. These results suggest that selection for thermodynamically stable proteins, coupled with nucleotide mutation bias filtered by the structure of the genetic code, is the primary driver behind the global amino acid substitution patterns observed in proteins throughout the tree of life.

20.