Similar literature
Found 20 similar documents; search took 15 ms
1.
Protein engineering by inserting stretches of random DNA sequences into target genes in combination with adequate screening or selection methods is a versatile technique to elucidate and improve protein functions. Established compounds for generating semi-random DNA sequences are spiked oligonucleotides which are synthesised by interspersing wild type (wt) nucleotides of the target sequence with certain amounts of other nucleotides. Directed spiking strategies reduce the complexity of a library to a manageable format compared with completely random libraries. Computational algorithms render feasible the calculation of appropriate nucleotide mixtures to encode specified amino acid subpopulations. The crucial element in the ranking of spiked codons generated during an iterative algorithm is the scoring function. In this report three scoring functions are analysed: the sum-of-square-differences function s, a modified cubic function c, and a scoring function m derived from maximum likelihood considerations. The impact of these scoring functions on calculated amino acid distributions is demonstrated by an example of mutagenising a domain surrounding the active site serine of subtilisin-like proteases. At default weight settings of one for each amino acid, the new scoring function m is superior to functions s and c in finding matches to a given amino acid population.
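The idea of scoring a spiked codon against a target amino acid population can be illustrated with a minimal sketch. The abstract does not give the exact form of the scoring function s, so the weighted sum of squared differences below is an assumption, as are the helper names `amino_acid_distribution` and `sum_of_square_differences` and the representation of a spiked codon as three per-position nucleotide mixtures. Only the standard genetic code is taken as given.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def amino_acid_distribution(mix):
    """Return the amino acid distribution encoded by a spiked codon.

    `mix` is a list of three dicts (hypothetical representation), one per
    codon position, each mapping a base to its fraction in the mixture.
    """
    dist = {}
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for position, base in enumerate(codon):
            p *= mix[position].get(base, 0.0)
        if p > 0.0:
            dist[aa] = dist.get(aa, 0.0) + p
    return dist


def sum_of_square_differences(target, encoded, weights=None):
    """Weighted sum of squared differences between two distributions.

    Assumed form only; each amino acid has weight 1 unless overridden.
    Lower scores mean a closer match.
    """
    weights = weights or {}
    keys = set(target) | set(encoded)
    return sum(
        weights.get(k, 1.0) * (target.get(k, 0.0) - encoded.get(k, 0.0)) ** 2
        for k in keys
    )


# A codon fixed at G and T in positions 1 and 3, with an equal A/C mixture
# in position 2, encodes GAT (Asp) and GCT (Ala) in equal proportions.
mix = [{"G": 1.0}, {"A": 0.5, "C": 0.5}, {"T": 1.0}]
encoded = amino_acid_distribution(mix)
score = sum_of_square_differences({"D": 1.0}, encoded)
```

An iterative search would vary the three mixtures and keep the codon with the best score; the sketch covers only the evaluation step.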

2.
3.
The cDNA clones encoding the precursor form of glycinin A3B4 subunit have been identified from a library of soybean cotyledonary cDNA clones in the plasmid pBR322 by a combination of differential colony hybridizations, and then by immunoprecipitation of hybrid-selected translation product with A3-mono-specific antiserum. A recombinant plasmid, designated pGA3B41425, from one of six clones covering codons for the NH2-terminal region of the subunit was sequenced, and the amino acid sequence was inferred from the nucleotide sequence, which showed that the mRNA codes for a precursor protein of 516 amino acids. Analysis of this cDNA also showed that it contained 1786 nucleotides of mRNA sequence with a 5'-terminal nontranslated region of 46 nucleotides, a signal peptide region corresponding to 24 amino acids, an A3 acidic subunit region corresponding to 320 amino acids followed by a B4 basic subunit region corresponding to 172 amino acids, and a 3'-terminal nontranslated region of 192 nucleotides, which contained two characteristic AAUAAA sequences that ended 110 nucleotides and 26 nucleotides from a 3'-terminal poly(A) segment, respectively. Our results confirm that glycinin is synthesized as precursor polypeptides which undergo post-translational processing to form the nonrandom polypeptide pairs via disulfide bonds. The inferred amino acid sequence of the mature basic subunit, B4, was compared to that of the basic subunit of pea legumin, Leg Beta, which contained 185 amino acids. Using an alignment that permitted a maximum homology of amino acids, it was found that overall 42% of the amino acid positions are identical in both proteins. These results led us to conclude that both storage proteins have a common ancestor.

4.
Protein combinatorial libraries provide new ways to probe the determinants of folding and to discover novel proteins. Such libraries are often constructed by expressing an ensemble of partially random gene sequences. Given the intractably large number of possible sequences, some limitation on diversity must be imposed. A non-uniform distribution of nucleotides can be used to reduce the number of possible sequences and encode peptide sequences having a predetermined set of amino acid probabilities at each residue position, i.e., the amino acid sequence profile. Such profiles can be determined by inspection, multiple sequence alignment or physically-based computational methods. Here we present a computational method that takes as input a desired sequence profile and calculates the individual nucleotide probabilities among partially random genes. The calculated gene library can be readily used in the context of standard DNA synthesis to generate a protein library with essentially the desired profile. The fidelity between the desired profile and the calculated one coded by these partially random genes is quantitatively evaluated using the linear correlation coefficient and a relative entropy, each of which provides a measure of profile agreement at each position of the sequence. On average, this method of identifying such codon frequencies performs as well as or better than other methods with regard to fidelity to the original profile. Importantly, the method presented here provides much better yields of complete sequences that do not contain stop codons, a feature that is particularly important when all or large fractions of a gene are subject to combinatorial mutation.
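The two fidelity measures named in this abstract can be sketched for a single residue position. This is an illustration under stated assumptions, not the authors' implementation: the direction of the relative entropy (desired relative to calculated), the base-2 logarithm, and the function names `relative_entropy` and `linear_correlation` are all choices made here, and profiles are represented as dicts from amino acid letter to probability.

```python
import math


def relative_entropy(desired, calculated):
    """Kullback-Leibler divergence D(desired || calculated) in bits.

    Returns infinity when the calculated profile assigns zero probability
    to an amino acid that the desired profile requires.
    """
    total = 0.0
    for aa, p in desired.items():
        if p > 0.0:
            q = calculated.get(aa, 0.0)
            if q == 0.0:
                return math.inf
            total += p * math.log2(p / q)
    return total


def linear_correlation(desired, calculated, alphabet):
    """Pearson correlation of two profiles over a fixed amino acid alphabet."""
    xs = [desired.get(a, 0.0) for a in alphabet]
    ys = [calculated.get(a, 0.0) for a in alphabet]
    n = len(alphabet)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return float("nan")  # undefined for a perfectly flat profile
    return cov / math.sqrt(var_x * var_y)


ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
desired = {"A": 0.5, "G": 0.5}
calculated = {"A": 0.25, "G": 0.75}
kl = relative_entropy(desired, calculated)
r = linear_correlation(desired, calculated, ALPHABET)
```

A relative entropy of zero and a correlation of one both indicate that the calculated profile reproduces the desired one at that position.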

5.
The complete nucleotide sequence of the maize chlorotic mottle virus (MCMV) genome has been determined to be 4437 nucleotides. The viral genome has four long open reading frames (ORFs) which could encode polypeptides of 31.6, 50, 8.9 and 25.1 kd. If the termination codons for the polypeptides encoded by the 50 and 8.9 kd ORFs are suppressed, readthrough products of 111 and 32.7 kd result. The 31.6 and 50 kd ORFs overlap for nearly the entire length of the 31.6 kd ORF. Striking amino acid homology has been observed between two potential polypeptides encoded by MCMV and polypeptides encoded by carnation mottle virus (CarMV) and turnip crinkle virus (TCV). The 25.1 kd ORF most likely encodes the capsid protein. The similar genome organization and amino acid sequence homology of MCMV with CarMV and TCV suggest an evolutionary relationship with these members of the carmovirus group.

6.
Human reovirus serotype 1 Lang strain s2 mRNA, which encodes the virion inner capsid core polypeptide sigma 2, was cloned as a cDNA:mRNA heteroduplex in Escherichia coli using phage M13. A complete consensus nucleotide sequence was determined. The Lang strain s2 mRNA is 1331 nucleotides in length and possesses an open reading frame with a coding capacity of 335 amino acids, sufficient to account for a sigma 2 polypeptide of 37,682 daltons. Comparison of the serotype 1 Lang s2 sequence derived from cDNA clones of s2 mRNA with the serotype 3 Dearing S2 sequence derived from cDNA clones of the S2 dsRNA genome segment reveals 86 percent homology at the nucleotide level. The predicted sigma 2 polypeptides of the Lang and Dearing strains display 98 percent homology at the amino acid level. Of 147 silent nt differences in the translated region, 136 were in the third base position of codons.

7.
Serotype 1 Lang strain s4 mRNA, which encodes the major capsid surface polypeptide sigma 3 of reovirions, was cloned as a cDNA:mRNA heteroduplex in Escherichia coli using phage M13. A complete consensus nucleotide sequence for s4 mRNA has been determined from cDNA clones. The Lang strain s4 mRNA is 1196 nucleotides in length and possesses an open reading frame with a coding capacity of 365 amino acids, sufficient to account for a sigma 3 polypeptide of 41,212 daltons. Comparison of the serotype 1 (Lang) s4 sequence with the serotype 3 (Dearing) s4 sequence reveals 94% homology at the nucleotide level; the predicted sigma 3 polypeptides of the Lang and Dearing strains display 96% homology at the amino acid level. Two third base C codons (leu:CUC and ser:AGC) are used about one-tenth as frequently in the reovirus s4 mRNAs as compared to mammalian cellular mRNAs.

8.
Cloned DNA copies of rotavirus genomic segment 6 from simian 11 (subgroup 1) and human strain Wa (subgroup 2) rotaviruses have been used to determine the nucleotide sequences of the gene that determines viral subgroup specificity. Both genomic segments are 1,356 nucleotides in length and possess 5'- and 3'-terminal untranslated regions of 23 and 142 nucleotides, respectively. The inferred amino acid sequence reveals VP6 to be a polypeptide of 397 amino acids in which more than 90% of the amino acid sequence is conserved between the two viruses. There are 34 amino acid changes between the subgroup 1 and 2 polypeptides, most clustered in three regions of the molecule at residues 39 through 62, 80 through 122, and 281 through 315.

9.
The Escherichia coli gene coding for the enzyme xanthine-guanine phosphoribosyl transferase (gpt) has been widely used as a dominant selectable marker in a variety of mammalian cells. We have determined the complete nucleotide sequence of the 1057 base pair (bp) segment of DNA containing this gene. The coding sequence for the enzyme is 456 nucleotides long and can code for a 152 amino acid (16.9 Kd) polypeptide. A comparison of the amino acid sequence of the bacterial enzyme with that of the mammalian hypoxanthine-guanine phosphoribosyl transferase (hprt) reveals no significant homology between the two polypeptides.

10.
Middle component RNA (M RNA) of cowpea mosaic virus (CPMV) was transcribed into cDNA and double-stranded cDNA was inserted into the EcoRI site of plasmid pBRH2. The nucleotide sequence of inserts was determined, after subcloning in bacteriophages M13mp7, M13mp8 or M13mp9, by the dideoxy chain termination method. The complete sequence of CPMV M RNA, up to the poly(A) tail, is 3481 nucleotides long. The sequence contains a long open reading frame starting at nucleotide 161 from the 5' terminus and continuing to 180 nucleotides from the 3' terminus. The sequence does not contain a polyadenylation signal for the poly(A) tail at the 3' end of CPMV RNA. The initiation site at position 161 together with AUG codons in the same reading frame at positions 512 and/or 524 account for the two large colinear precursor polypeptides translated in vitro from M RNA. The amino acid sequence deduced from the nucleotide sequence suggests that both precursor polypeptides are proteolytically cleaved at glutaminyl-methionine and glutaminyl-glycine, respectively, to produce the two viral capsid proteins.

11.
Two major chloroplast proteins are encoded by nuclear genes and synthesized on free cytoplasmic ribosomes: the small subunit of ribulose 1,5-bisphosphate carboxylase and the apoprotein components of the chlorophyll a/b light harvesting complex. We have recently reported the isolation of two cDNA clones from pea which encode both the small subunit of ribulose 1,5-bisphosphate carboxylase (pSS15) and the polypeptide 15 (pAB96), the major chlorophyll a/b binding protein (Broglie, R., Bellemare, G., Bartlett, S., Chua, N.-H., and Cashmore, A. R. (1981) Proc. Natl. Acad. Sci. U.S.A. 78, 7304-7308). To further characterize these clones, we determined their nucleotide sequence. Clone pSS15 contains a 691-base pair cDNA insert which encodes the entire 123 amino acids of the mature small subunit protein. In addition, this clone also encodes 33 amino acids of the NH2-terminal transit peptide extension and 148 nucleotides of the 3' noncoding region preceding the poly(A) tail. A second cDNA clone (pAB96) contains an 833-nucleotide insert which encodes most of polypeptide 15. The DNA sequence of this cloned cDNA was used to deduce the previously undetermined amino acid sequence of this integral thylakoid membrane protein. The nucleotide sequence of the cDNA clone, pSS15, should provide information concerning the role of the transit sequence in the transport of cytoplasmically synthesized chloroplast proteins. Similarly, the deduced amino acid sequence of polypeptide 15 will provide information for predicting its orientation in thylakoid membranes as well as its role in binding chlorophyll.

12.
13.
The design and rapid construction of libraries of genes coding beta-sheet forming repetitive and block-copolymerized polypeptides bearing various C- and N-terminal sequences are described. The design was based on the assembly of DNA cassettes coding for the (GA)3GX amino acid sequence where the (GAGAGA) sequences would constitute the beta-strand units of a larger beta-sheet assembly. The edges of this beta-sheet would be functionalized by the turn-inducing amino acids (GX). The polypeptides were expressed in Escherichia coli using conventional vectors and were purified by Ni-nitrilotriacetic acid (NTA) chromatography. The correlation of polymer structure with molecular weight was investigated by gel electrophoresis and mass spectrometry. The monomer sequences and post-translational chemical modifications were found to influence the mobility of the polypeptides over the full range of polypeptide molecular weights while the electrophoretic mobility of lower molecular weight polypeptides was more susceptible to C- and N-termini polypeptide modifications.

14.
15.
The open reading frame (ORF) that encodes the 226-amino-acid coat protein (hepatitis B virus surface antigen [HBsAg]) of hepatitis B virus has the potential to encode a 400-amino-acid polypeptide. The entire ORF would direct the synthesis of a polypeptide whose C-terminal amino acids represent HBsAg with an additional 174 amino acids at the N terminus (pre-s). Recently, virus particles have been shown to contain a polypeptide that corresponds to HBsAg with an additional 55 amino acids at the N terminus encoded by the DNA sequence immediately upstream of the HBsAg gene. A novel ORF expression vector containing the TAC promoter, the first eight codons of the gene for beta-galactosidase, and the entire coding sequence for chloramphenicol acetyltransferase was used in bacteria to express determinants of the 174 amino acids predicted from the pre-s portion of the ORF. The resulting tribrid protein containing 108 amino acids encoded by pre-s was expressed as one of the major proteins of bacteria harboring the recombinant plasmid. Single-step purification of the tribrid fusion protein was achieved by fractionation on a chloramphenicol affinity resin. Polyclonal antiserum generated to the fusion protein was capable of detecting 42- and 46-kilodalton polypeptides from virus particles; both polypeptides were also shown to contain HBsAg determinants. The ability of the polyclonal antiserum to identify polypeptides with these characteristics from virus particles presents compelling evidence that the DNA sequence of the entire ORF is expressed as a contiguous polypeptide containing HBsAg. The presence of multiple promoters and primary translation products from this single ORF argues that the function and potential interaction of the encoded polypeptides play a crucial role in the life cycle of the virus. 
Furthermore, the procedure and vector described in this report can be applied to other systems to facilitate the generation of antibodies to defined determinants and should allow the characterization of the epitope specificity of existing antibodies.

16.
The complete DNA sequence coding for the immediate-early protein (IE180) of pseudorabies virus was determined. The coding region of IE180 is 4380 nucleotides for 1460 amino acid residues. G+C content of the non-coding portion of the IE gene is 70.3% while the G+C content of the coding portion is considerably higher at 80.1%. Correspondingly, codons consisting mainly of Gs and Cs are favoured. Clusters of amino acid homologies are observed among IE180 of pseudorabies virus, ICP4 of herpes simplex virus type-1 and IE140 of varicella-zoster virus, and are organized similarly in all three polypeptides. Functions exhibited by IE180 are assigned, tentatively, to structural domains of the molecule by analogy to the HSV-1 ICP4 polypeptide.
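The G+C percentages quoted in this abstract are the fraction of G and C bases in a sequence. A minimal sketch of that calculation follows; the function name `gc_content` and the toy input sequences are illustrative only and are not taken from the IE180 gene. Skipping ambiguous symbols such as N is an assumption about how such bases would be handled.

```python
def gc_content(sequence):
    """Return the G+C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; other symbols such
    as N are skipped, which is an assumption made for this sketch.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


# Toy sequence, not from the paper: four of five bases are G or C.
example = gc_content("GCGCA")

# Consistency check on the figures in the abstract: 4380 coding
# nucleotides correspond to 4380 / 3 = 1460 codons.
codons = 4380 // 3
```

Applied separately to the coding and non-coding portions of a gene, the same function would yield the kind of comparison reported above.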

17.
The genes encoding carbamoylphosphate synthetase from Pseudomonas aeruginosa PAO1 were cloned in Escherichia coli. Deletion and transposition analysis determined the locations of carA, encoding the small subunit, and carB, encoding the large subunit, on the chromosomal insert. The nucleotide sequence of carA and the flanking regions was determined. The derived amino acid sequence for the small subunit of carbamoylphosphate synthetase from P. aeruginosa exhibited 68% homology with its counterparts in E. coli and Salmonella typhimurium. The derived sequences in the three organisms were essentially identical in the three polypeptide segments that are conserved in glutamine amidotransferases but showed low homology at the amino- and carboxy-terminal regions. The amino-terminal amino acid sequences were determined for the large and small subunits. The first 15 amino acids of the large subunit were identical to those derived from the carB sequence. However, comparison of the derived sequence for carA with the amino-terminal amino acid sequence for the small subunit suggested that codons 5 to 8 are not translated. The DNA sequence for the region encompassing these four codons was confirmed by direct sequencing of chromosomal DNA after amplification by the polymerase chain reaction. The mRNA sequence was also deduced by in vitro synthesis of cDNA, enzymatic amplification, and sequencing, confirming that 12 nucleotides at the 5' terminus of carA are transcribed but are not translated.

18.
The gene coding for cyclohexanone monooxygenase from Acinetobacter sp. strain NCIB 9871 was isolated by immunological screening methods. We located and determined the nucleotide sequence of the gene. The structural gene is 1,626 nucleotides long and codes for a polypeptide of 542 amino acids; 389 nucleotides 5' and 108 nucleotides 3' of the coding region are also reported. The complete amino acid sequence of the enzyme was derived by translation of the nucleotide sequence. From a comparison of the amino acid sequence with consensus sequences of nucleotide-binding folds, we identified a potential flavin-binding site at the NH2 terminus of the enzyme (residues 6 to 18) and a potential nicotinamide-binding site extending from residue 176 to residue 208 of the protein. An overproduction system for the gene to facilitate genetic manipulations was also constructed by using the tac promoter vector pKK223-3 in Escherichia coli.

19.
20.
A 693 base pair cloned fragment of bacteriophage T4 DNA, which specifically supports growth of T4 amber mutants in gene 57, has been sequenced. A polypeptide can be deduced from this sequence that is either 54 or 60 amino acids long, depending on which of two AUG codons, 18 nucleotides apart, is used for initiation. The size of this deduced polypeptide is compatible with the size of a single polypeptide (based on polyacrylamide gel electrophoresis) synthesized in vivo in E. coli under the direction of the cloned T4 DNA fragment.