Constraining protein sequence space: four amino acid alphabets are sufficient to recapitulate lambda repressor multimerization |
| |
Authors: | Maillet, Daniel S.; Drummond, James T. |
| |
Affiliation: | Indiana University, Department of Biology and Program in Biochemistry, Bloomington, IN 47405, USA |
| |
Abstract: | Nucleic acid polymers selected from random sequence space constitute an enormous array of catalytic, diagnostic and therapeutic molecules. Despite the fact that proteins are robust polymers with far greater chemical and physical diversity, success in unlocking protein sequence space remains elusive. We have devised a combinatorial strategy for accessing nucleic acid sequence space corresponding to proteins comprising selected amino acid alphabets. Using the SynthOMIC approach (synthesis of ORFs by multimerizing in-frame codons), representative libraries comprising four amino acid alphabets were fused in-frame to the lambda repressor DNA-binding domain to provide an in vivo selection for self-interacting proteins that reconstitute lambda repressor function. The frequency of self-interactors as a function of amino acid composition ranged over five orders of magnitude, from ∼6% of clones in a library comprising the amino acid residues LARE to ∼0.6 in 10⁶ in the MASH library. Sequence motifs were evident by inspection in many cases, and individual clones from each library presented substantial sequence identity with translated proteins by BLAST analysis. We posit that the SynthOMIC approach represents a powerful strategy for creating combinatorial libraries of open reading frames that distils protein sequence space on the basis of three inherent properties: it supports the use of selected amino acid alphabets, eliminates redundant sequences and locally constrains amino acids. |
| |
Keywords: | ORF, open reading frame; LA, limited alphabet; PEG, polyethylene glycol |
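
The two frequencies quoted in the abstract can be checked against the stated range of five orders of magnitude. The short Python sketch below does that arithmetic, reading the MASH figure as 0.6 self-interactors per 10^6 clones (an assumed reading of the exponent). It also compares the number of possible sequences for a four-letter and a twenty-letter alphabet at a purely hypothetical peptide length of 10 residues; that length and the helper name `sequence_space` are illustrative and do not come from the paper.

```python
import math

# Frequencies of self-interacting clones as quoted in the abstract.
lare_freq = 0.06           # ~6% of clones in the LARE library
mash_freq = 0.6 / 10**6    # ~0.6 per 10^6 clones in the MASH library (exponent assumed)

# Fold difference between the two libraries and its base-10 logarithm.
fold_range = lare_freq / mash_freq
orders_of_magnitude = math.log10(fold_range)


def sequence_space(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of a given length over an alphabet.

    Illustrative helper, not part of the published method.
    """
    return alphabet_size ** length


# Hypothetical peptide length, chosen only to illustrate the scale.
length = 10
four_letter = sequence_space(4, length)
twenty_letter = sequence_space(20, length)

print(f"fold range: {fold_range:.3g} (~{orders_of_magnitude:.1f} orders of magnitude)")
print(f"4-letter space at length {length}: {four_letter}")
print(f"20-letter space at length {length}: {twenty_letter}")
```

Under these assumptions the ratio of the two frequencies is about 10^5, consistent with the range stated in the abstract, and restricting the alphabet to four residues reduces the number of candidate sequences at a fixed length by several orders of magnitude.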