Similar articles
1.
The product of a kanamycin resistance gene encoded by plasmid pTB913 isolated from a thermophilic bacillus was identified as a kanamycin nucleotidyltransferase which is similar to that encoded by plasmid pUB110 from a mesophile, Staphylococcus aureus. The enzyme encoded by pTB913 was more thermostable than that encoded by pUB110. In view of a close resemblance of restriction endonuclease cleavage maps around the BglII site in the structural genes of both enzymes, ca. 1,200 base pairs were sequenced, followed by amino-terminal amino acid sequencing of the enzyme. The two nucleotide sequences were found to be identical to each other except for a single base in the middle of the structural gene. Each structural gene, initiating from a GUG codon as methionine, was composed of 759 base pairs and 253 amino acid residues (molecular weight, ca. 29,000). The sole difference was a transversion from a cytosine (pUB110) to an adenine (pTB913) at position +389, counting the first base of the initiation codon as +1. That is, a threonine at position 130 for the pUB110-coded kanamycin nucleotidyltransferase was replaced by a lysine for the pTB913-coded enzyme. The difference in thermostability between the two enzymes caused by a single amino acid replacement is discussed in light of electrostatic effects.
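
The position arithmetic in this abstract can be checked directly. The following is a minimal sketch, assuming 1-based nucleotide numbering from the first base of the initiation codon (as the abstract states) and the standard genetic code. The helper name `locate_base` is hypothetical, and the abstract does not say which threonine codon is actually present, so the last step only lists the candidates.

```python
def locate_base(nt_pos: int) -> tuple[int, int]:
    """Map a 1-based nucleotide position to (codon number, position within codon)."""
    codon_number = (nt_pos - 1) // 3 + 1
    pos_in_codon = (nt_pos - 1) % 3 + 1
    return codon_number, pos_in_codon

# 759 base pairs correspond to 253 codons.
gene_length_bp = 759
residues = gene_length_bp // 3

# Position +389 is the second base of codon 130.
codon, pos = locate_base(389)

# Standard genetic code: threonine is ACN, lysine is AAA or AAG.
THR_CODONS = {"ACA", "ACC", "ACG", "ACU"}
LYS_CODONS = {"AAA", "AAG"}

# Threonine codons that become lysine codons after a C -> A change
# at the second codon position.
candidates = sorted(c for c in THR_CODONS if c[0] + "A" + c[2] in LYS_CODONS)

print(residues, codon, pos, candidates)
```

Under these assumptions the reported position and the reported Thr-to-Lys replacement at residue 130 agree: only ACA or ACG can yield a lysine codon through this single change.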

2.
The evolutionary selection forces acting on a protein are commonly inferred using evolutionary codon models by contrasting the rate of synonymous to nonsynonymous substitutions. Most widely used models are based on theoretical assumptions and ignore the empirical observation that distinct amino acids differ in their replacement rates. In this paper, we develop a general method that allows assimilation of empirical amino acid replacement probabilities into a codon-substitution matrix. In this way, the resulting codon model takes into account not only the transition-transversion bias and the nonsynonymous/synonymous ratio, but also the different amino acid replacement probabilities as specified in empirical amino acid matrices. Different empirical amino acid replacement matrices, such as secondary structure-specific matrices or organelle-specific matrices (e.g., mitochondria and chloroplasts), can be incorporated into the model, making it context dependent. Using a diverse set of coding DNA sequences, we show that the novel model better fits biological data as compared with either mechanistic or empirical codon models. Using the suggested model, we further analyze human immunodeficiency virus type 1 protease sequences obtained from drug-treated patients and reveal positive selection in sites that are known to confer drug resistance to the virus.
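
The two ingredients named above, the transition-transversion bias and the nonsynonymous/synonymous ratio, can be illustrated with a short sketch. It assumes the standard genetic code and single-base codon changes. The function names and the parameters `kappa` and `omega` are illustrative, and the rate rule is the conventional mechanistic parameterization, not the empirical model developed in the paper.

```python
from itertools import product

# Standard genetic code in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}

def classify_substitution(codon_from: str, codon_to: str) -> tuple[str, str]:
    """Classify a single-base codon change by base type and coding effect."""
    diffs = [(a, b) for a, b in zip(codon_from, codon_to) if a != b]
    if len(diffs) != 1:
        raise ValueError("expected codons differing at exactly one position")
    a, b = diffs[0]
    # A transition keeps the base class (purine/purine or pyrimidine/pyrimidine).
    base_change = "transition" if (a in PURINES) == (b in PURINES) else "transversion"
    same_aa = CODON_TABLE[codon_from] == CODON_TABLE[codon_to]
    return base_change, "synonymous" if same_aa else "nonsynonymous"

def relative_rate(codon_from: str, codon_to: str, kappa: float, omega: float) -> float:
    """Relative rate: multiply by kappa for transitions, by omega for nonsynonymous changes."""
    base_change, effect = classify_substitution(codon_from, codon_to)
    rate = 1.0
    if base_change == "transition":
        rate *= kappa
    if effect == "nonsynonymous":
        rate *= omega
    return rate

print(classify_substitution("GAA", "GAG"))   # Glu -> Glu
print(classify_substitution("ACA", "AAA"))   # Thr -> Lys
```

A model of the kind described in the abstract would additionally weight each nonsynonymous change by an empirical amino acid replacement probability; that step is omitted here because the abstract does not give its exact form.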

3.
4.
Genetic distance and electrophoretic identity of proteins between taxa (cited 11 times in total: 0 self-citations, 11 citations by others)
Summary: The relationship between amino acid substitution and charge change of proteins in the evolutionary process is studied by using a stochastic model. A mathematical formula is developed for the electrophoretic identity of proteins between two different taxa for a given average number of codon differences per protein locus. Using this formula, a reference figure is constructed for estimating the average number of codon differences per locus between taxa.

5.
Miyazawa S. PLoS ONE 2011, 6(12): e28892
BACKGROUND: A mechanistic codon substitution model, in which each codon substitution rate is proportional to the product of a codon mutation rate and the average fixation probability depending on the type of amino acid replacement, has advantages over nucleotide, amino acid, and empirical codon substitution models in evolutionary analysis of protein-coding sequences. It can approximate a wide range of codon substitution processes. If no selection pressure on amino acids is taken into account, it will become equivalent to a nucleotide substitution model. If mutation rates are assumed not to depend on the codon type, then it will become essentially equivalent to an amino acid substitution model. Mutation at the nucleotide level and selection at the amino acid level can be separately evaluated. RESULTS: The present scheme for single nucleotide mutations is equivalent to the general time-reversible model, but multiple nucleotide changes in infinitesimal time are allowed. Selective constraints on the respective types of amino acid replacements are tailored to each gene in a linear function of a given estimate of selective constraints. Their good estimates are those calculated by maximizing the respective likelihoods of empirical amino acid or codon substitution frequency matrices. Akaike and Bayesian information criteria indicate that the present model performs far better than the other substitution models for all five phylogenetic trees of highly-divergent to highly-homologous sequences of chloroplast, mitochondrial, and nuclear genes. It is also shown that multiple nucleotide changes in infinitesimal time are significant in long branches, although they may be caused by compensatory substitutions or other mechanisms. The variation of selective constraint over sites fits the datasets significantly better than variable mutation rates, except for 10 slow-evolving nuclear genes of 10 mammals. 
A critical finding for phylogenetic analysis is that assuming variable mutation rates over sites leads to the overestimation of branch lengths.

6.
Colias eurytheme butterflies display extensive allozyme polymorphism in the enzyme phosphoglucose isomerase (PGI). Earlier studies on biochemical and fitness effects of these genotypes found evidence of strong natural selection maintaining this polymorphism in the wild. Here we analyze the molecular features of this polymorphism by sequencing multiple alleles and modeling their structures. PGI is a dimer with rotational symmetry. Each monomer provides a critical residue to the other monomer's catalytic center. Sequenced alleles differ at multiple amino acid positions, including cryptic charge-neutral variation, but most consistent differences among the electromorph alleles are at the charge-changing amino acid sites. Principal candidate sites of selection, identified by structural and functional analyses and by their variants' population frequencies, occur in interpenetrating loops across the interface between monomers, where they may alter subunit interactions and catalytic center geometry. Comparison to a second (and basal) species, Colias meadii, also polymorphic for PGI under natural selection, reveals one fixed amino acid difference between their PGIs, which is located in the interpenetrating loop and accompanies functional differences among their variants. We also study nucleotide variability among the PGI alleles, comparing these data to similar data from another glycolytic enzyme gene, glyceraldehyde-3-phosphate dehydrogenase. Despite extensive nonsynonymous and synonymous polymorphism at PGI in each species, the only base changes fixed between species are the two causing the amino acid replacement; this absence of synonymous fixation yields a significant McDonald-Kreitman test. Analyses of these data suggest historical population expansion. Positive peaks of Tajima's D statistic, representing regions of neutral "hitchhiking," are found around the principal candidate sites of selection. 
This study provides novel views of molecular-structural mechanisms, and beginnings of historical evidence, for a long-persistent balanced enzyme polymorphism at PGI in these and perhaps other species.

7.
A lambda gt11 cDNA library prepared from human liver poly(A) RNA has been screened with affinity-purified antibody to human factor XI, a blood coagulation factor composed of two identical polypeptide chains linked by a disulfide bond(s). A cDNA insert coding for factor XI was isolated and shown to contain 2097 nucleotides, including 54 nucleotides coding for a leader peptide of 18 amino acids and 1821 nucleotides coding for 607 amino acids that are present in each of the 2 chains of the mature protein. The cDNA for factor XI also contained a stop codon (TGA), a potential polyadenylation or processing sequence (AACAAA), and a poly(A) tail at the 3' end. Five potential N-glycosylation sites were found in each of the two chains of factor XI. The cleavage site for the activation of factor XI by factor XIIa was identified as an internal peptide bond between Arg-369 and Ile-370 in each polypeptide chain. This was based upon the amino acid sequence predicted by the cDNA and the amino acid sequence previously reported for the amino-terminal portion of the light chain of factor XI. Each heavy chain of factor XIa (369 amino acids) was found to contain 4 tandem repeats of 90 (or 91) amino acids plus a short connecting peptide. Each repeat probably forms a separate domain containing three internal disulfide bonds. The light chains of factor XIa (each 238 amino acids) contain the catalytic portion of the enzyme with sequences that are typical of the trypsin family of serine proteases. The amino acid sequence of factor XI shows 58% identity with human plasma prekallikrein.
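
The lengths quoted in this abstract are mutually consistent, which the following sketch verifies. All numbers are taken from the abstract. The variable names are illustrative, and the split of the remaining nucleotides into untranslated sequence and poly(A) tail is not given, so it is reported only as a single unaccounted figure.

```python
CODON_LEN = 3

leader_nt, leader_aa = 54, 18        # leader peptide
mature_nt, mature_aa = 1821, 607     # each chain of the mature protein
heavy_aa, light_aa = 369, 238        # chains released by cleavage at Arg-369/Ile-370
cdna_nt = 2097                       # full cDNA insert

# Nucleotide counts match the stated peptide lengths.
assert leader_nt == leader_aa * CODON_LEN
assert mature_nt == mature_aa * CODON_LEN

# Heavy chain (residues 1-369) plus light chain (residues 370-607) give 607.
assert heavy_aa + light_aa == mature_aa

# Coding sequence including the TGA stop codon, and what remains of the insert.
coding_nt = leader_nt + mature_nt + CODON_LEN
unaccounted_nt = cdna_nt - coding_nt

# Four tandem repeats of 90 or 91 residues leave a short connecting peptide.
connecting_aa_range = (heavy_aa - 4 * 91, heavy_aa - 4 * 90)

print(coding_nt, unaccounted_nt, connecting_aa_range)
```

The remainder of 5 to 9 residues in the heavy chain is consistent with the "short connecting peptide" mentioned in the abstract, though the abstract does not state its exact length.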

8.
Summary: The complete nucleotide sequence of the Salmonella strain LT2 gnd gene for 6-phosphogluconate dehydrogenase was determined. The gene contains 1404 bases and encodes a 468 amino acid polypeptide, the same length as that of Escherichia coli K12. The DNA sequence differs by 14.8% between the two organisms and the amino acid sequence by 3.6%. Changes are mostly in the third codon base and most of the amino acid changes are conservative.
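
The percentages in this abstract are simple proportions of differing aligned sites. The sketch below shows the calculation, assuming two pre-aligned sequences of equal length. The function name `percent_difference` is hypothetical, and the site counts at the end are rough back-calculations from the rounded percentages, not figures reported by the authors.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two equal-length sequences differ."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(x != y for x, y in zip(seq_a, seq_b))
    return 100.0 * mismatches / len(seq_a)

gene_nt = 1404
protein_aa = gene_nt // 3            # 468 residues, as stated

# Approximate numbers of differing sites implied by the rounded percentages.
approx_nt_diffs = round(0.148 * gene_nt)
approx_aa_diffs = round(0.036 * protein_aa)

print(protein_aa, approx_nt_diffs, approx_aa_diffs)
```

The gap between the nucleotide and amino acid figures reflects the abstract's observation that most changes fall at the third codon base, where many substitutions are silent.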

9.
J. P. Carulli, D. L. Hartl. Genetics 1992, 132(1): 193-204
DNA sequences and chromosomal locations of four Drosophila pseudoobscura opsin genes were compared with those from Drosophila melanogaster, to determine factors that influence the evolution of multigene families. Although the opsin proteins perform the same primary functions, the comparisons reveal a wide range of evolutionary rates. Amino acid identities for the opsins range from 90% for Rh2 to more than 95% for Rh1 and Rh4. Variation in the rate of synonymous site substitution is especially striking: the major opsin, encoded by the Rh1 locus, differs at only 26.1% of synonymous sites between D. pseudoobscura and D. melanogaster, while the other opsin loci differ by as much as 39.2% at synonymous sites. Rh3 and Rh4 have similar levels of synonymous nucleotide substitution but significantly different amounts of amino acid replacement. This decoupling of nucleotide substitution and amino acid replacement suggests that different selective pressures are acting on these similar genes. There is significant heterogeneity in base composition and codon usage bias among the opsin genes in both species, but there are no consistent relationships between these factors and the rate of evolution of the opsins. In addition to exhibiting variation in evolutionary rates, the opsin loci in these species reveal rearrangements of chromosome elements.

10.
M. A. Soto, C. J. Tohá. Bio Systems 1985, 18(2): 209-215
A quantitative rationale for the evolution of the genetic code is developed considering the principle of minimal hardware. This principle defines an optimal code as one that minimizes, for a given amount of information encoded, the product of the number of physical devices used and the average complexity of each device. By identifying the number of different amino acids, number of nucleotide positions per codon and number of base types that can occupy each such position with, respectively, the amount of information, number of devices and the complexity, we show that optimal codes occur for 3, 7 and 20 amino acids with codons having a single, two and three base positions per codon, respectively. The advantage of a code of exactly 4 symbols is deduced, as well as a plausible evolutionary pathway from a code of doublets to triplets. The present day code of 20 amino acids encoded by 64 codons is shown to be the most optimal in an absolute sense. Using a tetraplet code, further evolution to a code in which there would be 55 amino acids is in principle possible, but such a code would deviate slightly more than the present day code from the minimal hardware configuration. The change from a triplet code to a tetraplet code would occur at about 32 amino acids. Our conclusions are independent of, but consistent with, the observed physico-chemical properties of the amino acids and codon structures. These correlations could have evolved within the constraints imposed by the minimal hardware principle.

11.
This paper analyzes the nucleotide sequences of three viruses: Kunjin, West Nile, and yellow fever. Each virus has one long open reading frame of greater than 10,200 nucleotides that codes for four structural and seven nonstructural genes. The Kunjin and West Nile viruses are the most closely related pair, when assessed on the basis of matches between their nucleotide sequences. As would be expected, the matching is least for bases at third-position codon sites and is greatest for second-position sites. Statistics are presented for the numbers of mismatches that are transitions or transversions. Nucleotide base usage is also reported. To each of the 33 virus-gene segments, nonhomogeneous Markov chain models have been fitted to describe the sequences of nucleotide bases. The models allow for different transition probabilities ("transition" is used in the mathematical sense here) and for different degrees of dependency, at the three sites in the codons. Reasonably satisfactory fits can be obtained for many of the genes by using models that are first order for both first- and second-position sites in the codon but that are second order for third-position sites. One consequence of such a model is that the correlation between one amino acid and the next is limited to the correlation of the last base of the former with the first base of the latter. Other consequences are that the model can (and does) prohibit the occurrence of stop codons within a gene and that subsequences of only first-position bases, or only third-position bases, are also first-order Markov chains. In theory, second-position subsequences may not be Markov chains at all. In practice, the data suggest that each of these subsequences is effectively a zero-order Markov chain, i.e., bases spaced three apart are statistically independent. Stationarity of nucleotide base distributions can be interpreted in either of two ways: (1) spatially along the sites or (2) temporally at each site. 
These interpretations must often be inconsistent, when the former allows for Markov dependence between adjacent sites whereas the latter assumes independence between sites. The inconsistency can be overcome, for these viruses, if subsequences at different codon positions are analyzed separately.
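
The idea of a nonhomogeneous Markov chain, with separate transition probabilities for each codon position, can be sketched as follows. This is a simplified illustration that uses a first-order chain at all three positions, whereas the abstract reports a second-order dependence at third-position sites. The function name and the toy sequence are hypothetical.

```python
from collections import defaultdict

def position_specific_transitions(cds: str) -> list[dict]:
    """Estimate first-order transition probabilities separately for each codon position.

    probs[k][prev][cur] is the estimated probability of base `cur` at codon
    position k (0, 1 or 2) given that the preceding base in the sequence is `prev`.
    """
    counts = [defaultdict(lambda: defaultdict(int)) for _ in range(3)]
    for i in range(1, len(cds)):
        counts[i % 3][cds[i - 1]][cds[i]] += 1
    probs = []
    for table in counts:
        probs.append({
            prev: {cur: n / sum(row.values()) for cur, n in row.items()}
            for prev, row in table.items()
        })
    return probs

# Toy in-frame sequence (hypothetical): ATG GCT GCA GCT
probs = position_specific_transitions("ATGGCTGCAGCT")
print(probs[2]["C"])   # distribution of third-position bases following a C
```

Keeping one table per codon position is what makes the chain nonhomogeneous: the same preceding base can imply different probabilities depending on where in the codon the next base falls.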

12.
Summary: Some simple formulae were obtained which enable us to estimate evolutionary distances in terms of the number of nucleotide substitutions (and, also, the evolutionary rates when the divergence times are known). In comparing a pair of nucleotide sequences, we distinguish two types of differences; if homologous sites are occupied by different nucleotide bases but both are purines or both pyrimidines, the difference is called type I (or transition type), while, if one of the two is a purine and the other is a pyrimidine, the difference is called type II (or transversion type). Letting P and Q be respectively the fractions of nucleotide sites showing type I and type II differences between two sequences compared, then the evolutionary distance per site is $K = -\frac{1}{2}\ln\{(1 - 2P - Q)\sqrt{1 - 2Q}\}$. The evolutionary rate per year is then given by $k = K/(2T)$, where T is the time since the divergence of the two sequences. If only the third codon positions are compared, the synonymous component of the evolutionary base substitutions per site is estimated by $K'_S = -\frac{1}{2}\ln(1 - 2P - Q)$. Also, formulae for standard errors were obtained. Some examples were worked out using reported globin sequences to show that synonymous substitutions occur at much higher rates than amino acid-altering substitutions in evolution. Contribution No. 1330 from the National Institute of Genetics, Mishima, 411 Japan.
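
The distance formulae in this abstract translate into a few lines of code. The sketch below assumes the two-parameter form $K = -\frac{1}{2}\ln\{(1 - 2P - Q)\sqrt{1 - 2Q}\}$ for the distance per site and two pre-aligned sequences of equal length. The function names and the toy sequences are illustrative, and the standard-error formulae mentioned in the abstract are not reproduced.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def transition_transversion_fractions(seq1: str, seq2: str) -> tuple[float, float]:
    """Return (P, Q): fractions of sites with type I (transition) and type II (transversion) differences."""
    if not seq1 or len(seq1) != len(seq2):
        raise ValueError("sequences must be non-empty and of equal length")
    transitions = transversions = 0
    for a, b in zip(seq1, seq2):
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    n = len(seq1)
    return transitions / n, transversions / n

def evolutionary_distance(p: float, q: float) -> float:
    """Distance per site: K = -(1/2) ln{(1 - 2P - Q) sqrt(1 - 2Q)}."""
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

def third_position_distance(p: float, q: float) -> float:
    """Third-codon-position estimate: K'_S = -(1/2) ln(1 - 2P - Q)."""
    return -0.5 * math.log(1 - 2 * p - q)

def rate_per_year(k: float, divergence_time_years: float) -> float:
    """Evolutionary rate per site per year: k = K / (2T)."""
    return k / (2 * divergence_time_years)

# Toy aligned sequences (hypothetical): two transitions and one transversion in ten sites.
P, Q = transition_transversion_fractions("AAGGCCTTAA", "AGGGCTTTAC")
print(P, Q, evolutionary_distance(P, Q))
```

Both logarithms are defined only while their arguments stay positive, so heavily diverged sequences (large P and Q) cannot be handled by these expressions.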

13.
Wall DP, Herbeck JT. Journal of Molecular Evolution 2003, 56(6): 673-88; discussion 689-90
In this study we reconstruct the evolution of codon usage bias in the chloroplast gene rbcL using a phylogeny of 92 green-plant taxa. We employ a measure of codon usage bias that accounts for chloroplast genomic nucleotide content, as an attempt to limit plausible explanations for patterns of codon bias evolution to selection- or drift-based processes. This measure uses maximum likelihood-ratio tests to compare the performance of two models, one in which a single codon is overrepresented and one in which two codons are overrepresented. The measure allowed us to analyze both the extent of bias in each lineage and the evolution of codon choice across the phylogeny. Despite predictions based primarily on the low G + C content of the chloroplast and the high functional importance of rbcL, we found large differences in the extent of bias, suggesting differential molecular selection that is clade specific. The seed plants and simple leafy liverworts each independently derived a low level of bias in rbcL, perhaps indicating relaxed selectional constraint on molecular changes in the gene. Overrepresentation of a single codon was typically plesiomorphic, and transitions to overrepresentation of two codons occurred commonly across the phylogeny, possibly indicating biochemical selection. The total codon bias in each taxon, when regressed against the total bias of each amino acid, suggested that twofold amino acids play a strong role in inflating the level of codon usage bias in rbcL, despite the fact that twofolds compose a minority of residues in this gene. Those amino acids that contributed most to the total codon usage bias of each taxon are known through amino acid knockout and replacement to be of high functional importance. This suggests that codon usage bias may be constrained by particular amino acids and, thus, may serve as a good predictor of what residues are most important for protein fitness.

14.
The GC contents of 2670 prokaryotic genomes that belong to diverse phylogenetic lineages were analyzed in this paper. These genomes had GC contents that ranged from 13.5% to 74.9%. We analyzed the distance of base frequencies at the three codon positions, codon frequencies, and amino acid compositions across genomes with respect to the differences in the GC content of these prokaryotic species. We found that although the phylogenetic lineages were remote among some species, a similar genomic GC content forced them to adopt similar base usage patterns at the three codon positions, codon usage patterns, and amino acid usage patterns. Our work demonstrates that in prokaryotic genomes: a) base usage, codon usage, and amino acid usage change with GC content with a linear correlation; b) the distance of each usage has a linear correlation with the GC content difference; and c) GC content is more essential than phylogenetic lineage in determining base usage, codon usage, and amino acid usage. This work is exceptional in that we adopted intuitively graphic methods for all analyses, and we used these analyses to examine as many as 2670 prokaryotes. We hope that this work is helpful for understanding common features in the organization of microbial genomes.
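
Base usage at the three codon positions, the first of the quantities compared in this abstract, can be computed as below. This is a minimal sketch assuming an in-frame coding sequence whose length is a multiple of three. The function name and the toy sequence are hypothetical, and the abstract does not specify the distance measure the authors used, so none is shown.

```python
def gc_by_codon_position(cds: str) -> tuple[float, float, float]:
    """Return the GC fraction at codon positions 1, 2 and 3 of an in-frame coding sequence."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be non-empty and a multiple of three long")
    n_codons = len(cds) // 3
    counts = [0, 0, 0]
    for i, base in enumerate(cds):
        if base in "GC":
            counts[i % 3] += 1
    return tuple(c / n_codons for c in counts)

# Toy sequence (hypothetical): ATG GCC AAG TAA
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCAAGTAA")
print(gc1, gc2, gc3)
```

Applied gene by gene across a genome, these three fractions give the per-position base usage that the abstract relates to overall genomic GC content.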

15.
A DNA fragment including most of the tyrA gene from E. coli B/r strain WU (Tyr-, Leu-) was amplified in vitro by polymerase chain reaction. The sequence was determined, first, for essentially all of the fragment to locate an ochre nonsense defect, and second, repeatedly for a region of the fragment from several independent isolates containing backmutations at the ochre codon (spontaneous and UV-induced). There were 20 single base differences in the tyrA gene region from the analogous wild-type E. coli K12 sequence: an ochre codon at amino acid position 161, 18 silent changes (1 at the first codon base and 17 at the third) and one replacement of valine by alanine. Different backmutations at the ochre codon encoded lysine, glutamine, glutamic acid, leucine, cysteine, phenylalanine, serine or tyrosine. The diversities of base substitutions at the ochre codon after UV mutagenesis or after mutagenesis where targeting by dimers was reduced or eliminated (after photoreversal of irradiated cells treated with nalidixic acid to induce SOS functions or after UV mutagenesis of cells containing amplified DNA photolyase) were similar (with two notable exceptions). The overall differences between the gene sequences for E. coli K12 or B/r seemed consistent with the neutral theory of molecular evolution.

16.
Suzuki Y, Gojobori T. Gene 2001, 276(1-2): 83-87
To predict the amino acid sites important for the clearance of hepatitis C virus (HCV) subtype 1b in vivo, positively selected amino acid sites were detected by analyzing the sequence data collected from the international DNA databank. The rate of nonsynonymous substitutions per nonsynonymous site was compared with that of synonymous substitutions per synonymous site for each codon site in the entire coding region. As a result, 13 out of 3010 amino acid sites were found to be positively selected. Among the 13 positively selected amino acid sites, eight were located in the structural proteins and five were in the nonstructural proteins. Moreover, eight were located in B-cell epitopes and two were in T-cell epitopes. These observations suggest that both the antibody and the cytotoxic T lymphocyte are involved in the clearance of HCV subtype 1b in vivo. These positively selected amino acid sites represent candidate vaccination targets for HCV subtype 1b.

17.
18.
Highly expressed plastid genes display codon adaptation, which is defined as a bias toward a set of codons which are complementary to abundant tRNAs. This type of adaptation is similar to what is observed in highly expressed Escherichia coli genes and is probably the result of selection to increase translation efficiency. In the current work, the codon adaptation of plastid genes is studied with regard to three specific features that have been observed in E. coli and which may influence translation efficiency. These features are (1) a relatively low codon adaptation at the 5′ end of highly expressed genes, (2) an influence of neighboring codons on codon usage at a particular site (codon context), and (3) a correlation between the level of codon adaptation of a gene and its amino acid content. All three features are found in plastid genes. First, highly expressed plastid genes have a noticeable decrease in codon adaptation over the first 10–20 codons. Second, for the twofold degenerate NNY codon groups, highly expressed genes have an overall bias toward the NNC codon, but this is not observed when the 3′ neighboring base is a G. At these sites highly expressed genes are biased toward NNT instead of NNC. Third, plastid genes that have higher codon adaptations also tend to have an increased usage of amino acids with a high G + C content at the first two codon positions and GNN codons in particular. The correlation between codon adaptation and amino acid content exists separately for both cytosolic and membrane proteins and is not related to any obvious functional property. It is suggested that at certain sites selection discriminates between nonsynonymous codons based on translational, not functional, differences, with the result that the amino acid sequence of highly expressed proteins is partially influenced by selection for increased translation efficiency. Received: 21 July 1999 / Accepted: 5 November 1999

19.
Two fetal globin genes (G gamma and A gamma) from one chromosome of a lowland gorilla (Gorilla gorilla gorilla) have been sequenced and compared to three human loci (a G gamma-gene and two A gamma-alleles). A comparison of regions of local homology among these five sequences indicates that long after the duplication that produced the two nonallelic gamma-globin loci of catarrhine primates, about 35 million years (Myr) ago, at least one gene conversion event occurred between these loci. This conversion occurred not long before the ancestral divergence (about 6 Myr ago) of Homo and Gorilla. After this ancestral divergence, a minimum of three more gene conversion events occurred in the human lineage. Each human A gamma-allele shares specific sequence features with the gorilla A gamma-gene; one such distinctive allelic feature involves the simple repeated sequence in IVS 2. This suggests that early in the human lineage the A gamma-genes may have undergone a crossing-over event mediated by this simple repeated sequence. The DNA sequences from coding regions of both G gamma- and A gamma-loci, a comparison of 292 codons in the corresponding gorilla and human genes, show an unusually low evolutionary rate, with only two nonsilent differences and, surprisingly, not even one silent substitution. The two nonsynonymous substitutions observed predict a glycine at codon 73 and an arginine at codon 104 in the gorilla A gamma-sequence rather than aspartic acid and lysine, respectively, in human A gamma. Because only arginine has been found at position 104 in gamma-chains of Old World monkeys, it may represent the ancestral residue lost in gorilla and human G gamma-chains and in the human A gamma-chain. Possibly the arginine codon (AGG) was replaced by the lysine codon (AAG) in the G gamma-gene of a common ancestor of Homo and Gorilla and then was transferred to the A gamma-gene by subsequent conversions in the human lineage. 
DNA sequence conversions, similar to that attributed to the fetal gamma-globin genes, appear to be relatively frequent phenomena and, if widespread throughout the genome, may have profound evolutionary consequences.

20.
Models of amino acid substitution were developed and compared using maximum likelihood. Two kinds of models are considered. "Empirical" models do not explicitly consider factors that shape protein evolution, but attempt to summarize the substitution pattern from large quantities of real data. "Mechanistic" models are formulated at the codon level and separate mutational biases at the nucleotide level from selective constraints at the amino acid level. They account for features of sequence evolution, such as transition-transversion bias and base or codon frequency biases, and make use of physicochemical distances between amino acids to specify nonsynonymous substitution rates. A general approach is presented that transforms a Markov model of codon substitution into a model of amino acid replacement. Protein sequences from the entire mitochondrial genomes of 20 mammalian species were analyzed using different models. The mechanistic models were found to fit the data better than empirical models derived from large databases. Both the mutational distance between amino acids (determined by the genetic code and mutational biases such as the transition-transversion bias) and the physicochemical distance are found to have strong effects on amino acid substitution rates. A significant proportion of amino acid substitutions appeared to have involved more than one codon position, indicating that nucleotide substitutions at neighboring sites may be correlated. Rates of amino acid substitution were found to be highly variable among sites.
