Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Variation in GC content, GC skew and AT skew along genomic regions was examined at third codon positions in completely sequenced prokaryotes. Eight out of nine eubacteria studied show GC and AT skews that change sign at the origin of replication. The leading strand in DNA replication is G-T rich at codon position 3 in six eubacteria, but C-T rich in two Mycoplasma species. In M. genitalium the AT and GC skews are symmetrical around the origin and terminus of replication, whereas its GC content variation has been shown to have a centre of symmetry elsewhere in the genome. Borrelia burgdorferi and Treponema pallidum show extraordinary extents of base composition skew correlated with direction of DNA replication. Base composition skews measured at third codon positions probably reflect mutational biases, whereas those measured over all bases in a sequence (or at codon positions 1 and 2) can be strongly affected by protein considerations due to the tendency in some bacteria for genes to be transcribed in the same direction that they are replicated. Consequently in some species the direction of skew for total genomic DNA is opposite to that for codon position 3. Received: 2 February 1998 / Accepted: 15 June 1998
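
The quantities named in this abstract can be sketched in a few lines. The abstract does not state its formulas, so the block below assumes the conventional definitions, GC skew = (G − C)/(G + C) and AT skew = (A − T)/(A + T), applied only to third codon positions. The function name `third_position_skews` is hypothetical and the input is assumed to be an in-frame coding sequence.

```python
from collections import Counter


def third_position_skews(cds: str) -> dict:
    """GC content, GC skew and AT skew at third codon positions.

    Assumes `cds` is an in-frame coding sequence; trailing bases that do
    not complete a codon are ignored. The skew formulas are the
    conventional ones, not taken from the abstract.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third = cds[2:usable:3]  # every third base, starting at codon position 3
    n = Counter(third)
    g, c, a, t = n["G"], n["C"], n["A"], n["T"]
    total = g + c + a + t
    return {
        "gc3": (g + c) / total if total else 0.0,
        "gc_skew": (g - c) / (g + c) if g + c else 0.0,
        "at_skew": (a - t) / (a + t) if a + t else 0.0,
    }
```

Computing these values gene by gene and plotting them against genomic position would be one way to look for the sign change at the origin of replication that the abstract describes.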

2.
Biased codon usage is common in eukaryotic and prokaryotic genes. Evidence from Escherichia, Saccharomyces, and Drosophila indicates that it favors translational efficiency and accuracy. However, to date no functional advantages have been identified in the codon–anticodon interactions involving the most frequently used (preferred) codons. Here we present evidence that forces not related to the individual codon–anticodon interaction may be involved in determining which synonymous codons are preferred or avoided. We show that the "off-frame" trinucleotide motif preferences inferrable from Drosophila coding regions are often in the same direction as Drosophila's "in-frame" codon preferences, i.e., its codon usage. The off-frame preferences were inferred from the nonrandomness of the location of confamilial synonymous codons along coding regions, a pattern often described as a context dependence of nucleotide choice at synonymous positions or as codon-pair bias. We relied on randomizations of the location of confamilial codons that do not alter, and cannot be influenced by, the encoded amino acid sequences, codon usage, or base composition of the genes examined. The statistically significant congruency of in-frame and off-frame trinucleotide preferences suggests that the same kind of reading-frame-independent force(s) may also influence synonymous codon choice. These forces may have produced biases in codon usage that then led to the evolution of the translational advantages of these motifs as preferred codons. Under this scenario, tRNA pool size differences between preferred and nonpreferred codons initially evolved to track the default overrepresentation of codons with preferred motifs. The motif preference hypothesis can explain the structuring of codon preferences and the similarities in the codon usages of distantly related organisms. Received: 10 November 1998 / Accepted: 23 February 1999

3.
In many unicellular organisms, invertebrates, and plants, synonymous codon usage biases result from a coadaptation between codon usage and tRNA abundance to optimize the efficiency of protein synthesis. However, it remains unclear whether natural selection acts at the level of the speed or the accuracy of mRNA translation. Here we show that codon usage can improve the fidelity of protein synthesis in multicellular species. As predicted by the model of selection for translational accuracy, we find that the frequency of codons optimal for translation is significantly higher at codons encoding conserved amino acids than at codons encoding nonconserved amino acids in 548 genes compared between Caenorhabditis elegans and Homo sapiens. Although this model predicts that codon bias correlates positively with gene length, a negative correlation between codon bias and gene length has been observed in eukaryotes. This suggests that selection for fidelity of protein synthesis is not the main factor responsible for codon biases. The relationship between codon bias and gene length remains unexplained. Exploring the differences in the gene expression process between eukaryotes and prokaryotes should provide new insights into this key question of codon usage. Received: 18 June 2000 / Accepted: 10 November 2000

4.
Synonymous codon choices vary considerably among Schistosoma mansoni genes. Principal components analysis detects a single major trend among genes, which highly correlates with GC content in third codon positions and exons, but does not discriminate among putatively highly and lowly expressed genes. The effective number of codons used in each gene, and its distribution when plotted against GC3, suggests that codon usage is shaped mainly by mutational biases. The GC contents of exons, GC3, 5′, 3′, and flanking (5′ + 3′ + introns) regions are all correlated with one another, suggesting that variations in GC content may exist among different regions of the S. mansoni genome. We propose that this genome structure might be among the most important factors shaping codon usage in this species, although the action of selection on certain sequences cannot be excluded. Received: 10 March 1997 / Accepted: 27 June 1997

5.
6.
Phylogenetic analyses frequently rely on models of sequence evolution that detail nucleotide substitution rates, nucleotide frequencies, and site-to-site rate heterogeneity. These models can influence hypothesis testing and can affect the accuracy of phylogenetic inferences. Maximum likelihood methods of simultaneously constructing phylogenetic tree topologies and estimating model parameters are computationally intensive, and are not feasible for sample sizes of 25 or greater using personal computers. Techniques that initially construct a tree topology and then use this non-maximized topology to estimate ML substitution rates, however, can quickly arrive at a model of sequence evolution. The accuracy of this two-step estimation technique was tested using simulated data sets with known model parameters. The results showed that for a star-like topology, as is often seen in human immunodeficiency virus type 1 (HIV-1) subtype B sequences, a random starting topology could produce nucleotide substitution rates that were not statistically different from the true rates. Samples were isolated from 100 HIV-1 subtype B infected individuals from the United States and a 620 nt region of the env gene was sequenced for each sample. The sequence data were used to obtain a substitution model of sequence evolution specific for HIV-1 subtype B env by estimating nucleotide substitution rates and the site-to-site heterogeneity in 100 individuals from the United States. The method of estimating the model should provide users of large data sets with a way to quickly compute a model of sequence evolution, while the nucleotide substitution model we identified should prove useful in the phylogenetic analysis of HIV-1 subtype B env sequences. Received: 4 October 2000 / Accepted: 1 March 2001

7.
We examined a region of high variability in the mosaic mercury resistance (mer) operon of natural bacterial isolates from the primate intestinal microbiota. The region between the merP and merA genes of nine mer loci was sequenced and either the merC, the merF, or no gene was present. Two novel merC genes were identified. Overall nucleotide diversity, π (per 100 sites), of the merC gene was greater (49.63) than that of the adjacent merP (35.82) and merA (32.58) genes. However, the consequences of this variability for the predicted structure of the MerC protein are limited and putative functional elements (metal-binding ligands and transmembrane domains) are strongly conserved. Comparison of codon usage of the merTP, merC, and merA genes suggests that several merC genes are not coeval with their flanking sequences. Although evidence of homologous recombination within the very variable merC genes is not apparent, the flanking regions have higher homologies than merC, and recombination appears to be driving their overall sequence identities higher. The synonymous codon usage bias (ENC) values suggest greater variability in expression of the merC gene than in flanking genes in six different bacterial hosts. We propose a model for the evolution of MerC as a host-dependent, adventitious module of the mer operon. Received: 2 June 2000 / Accepted: 23 October 2000
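
As a rough illustration of the statistic quoted here, nucleotide diversity π can be computed as the mean proportion of differing sites over all pairs of aligned sequences, scaled to 100 sites. The abstract does not say how π was estimated, so the unweighted pairwise mean, the skipping of gaps and ambiguous bases, and the function name `nucleotide_diversity` are all assumptions of this sketch.

```python
from itertools import combinations


def nucleotide_diversity(aligned, per_sites=100):
    """Mean pairwise proportion of differing sites, scaled to `per_sites`.

    `aligned` is a list of equal-length aligned sequences. Columns where
    either sequence has a gap or an ambiguous base are skipped for that
    pair (an assumption of this sketch, not taken from the abstract).
    """
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")

    total, pairs = 0.0, 0
    for a, b in combinations(aligned, 2):
        compared = diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in "ACGT" and y in "ACGT":
                compared += 1
                diffs += x != y
        if compared:
            total += diffs / compared
            pairs += 1
    return per_sites * total / pairs if pairs else float("nan")
```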

8.
Synonymous codon usage in related species may differ as a result of variation in mutation biases, differences in the overall strength and efficiency of selection, and shifts in codon preference (the selective hierarchy of codons within and between amino acids). We have developed a maximum-likelihood method to employ explicit population genetic models to analyze the evolution of parameters determining codon usage. The method is applied to twofold degenerate amino acids in 50 orthologous genes from D. melanogaster and D. virilis. We find that D. virilis has significantly reduced selection on codon usage for all amino acids, but the data are incompatible with a simple model in which there is a single difference in the long-term N_e, or overall strength of selection, between the two species, indicating shifts in codon preference. The strength of selection acting on codon usage in D. melanogaster is estimated to be |N_e s| ≈ 0.4 for most CT-ending twofold degenerate amino acids, but 1.7 times greater for cysteine and 1.4 times greater for AG-ending codons. In D. virilis, the strength of selection acting on codon usage for most amino acids is only half that acting in D. melanogaster but is considerably greater than half for cysteine, perhaps indicating the dual selection pressures of translational efficiency and accuracy. Selection coefficients in orthologues are highly correlated (ρ = 0.46), but a number of genes deviate significantly from this relationship. Received: 20 December 1998 / Accepted: 17 February 1999

9.
To characterize the coding-sequence divergence of closely related genomes, we compared DNA sequence divergence between sequences from a Brassica rapa ssp. pekinensis EST library isolated from flower buds and genomic sequences from Arabidopsis thaliana. The specific objectives were (i) to determine the distribution of and relationship between Ka and Ks, (ii) to identify genes with the lowest and highest Ka:Ks values, and (iii) to evaluate how codon usage has diverged between two closely related species. We found that the distribution of Ka:Ks was unimodal, and that substitution rates were more variable at nonsynonymous than synonymous sites, and detected no evidence that Ka and Ks were positively correlated. Several genes had Ka:Ks values equal to or near zero, as expected for genes that have evolved under strong selective constraint. In contrast, there were no genes with Ka:Ks > 1 and thus we found no strong evidence that any of the 218 sequences we analyzed have evolved in response to positive selection. We detected a stronger codon bias but a lower frequency of GC at synonymous sites in A. thaliana than B. rapa. Moreover, there has been a shift in the profile of most commonly used synonymous codons since these two species diverged from one another. This shift in codon usage may have been caused by stronger selection acting on codon usage or by a shift in the direction of mutational bias in the B. rapa phylogenetic lineage.

10.
Complete sequences of seven protein-coding genes from Penaeus notialis mitochondrial DNA were compared in base composition and codon usage with homologous genes from Artemia franciscana and four insects. The crustacean genes are significantly less A + T-rich than their counterparts in insects and the pattern of codon usage (ratio of G + C-rich versus A + T-rich codons) is less biased. A phylogenetic analysis using amino acid sequences of the seven corresponding polypeptides supports a sister-taxon status for mollusks–annelids and arthropods. Furthermore, a distance matrix-based tree and two most-parsimonious trees both suggest that crustaceans are paraphyletic with respect to insects. This is also supported by the inclusion of Panulirus argus COII (complete) and COI and COIII (partial) sequence data. From analysis of single and combined genes to infer phylogenies, it is observed that the topologies obtained from single genes are not well supported in most cases and differ notably from that of the tree based on all seven genes. Received: 25 August 1998 / Accepted: 8 March 1999

11.
Mycobacterium tuberculosis and Mycobacterium leprae are the etiological agents of tuberculosis and leprosy, respectively. After performing extensive comparisons between genes from these two GC-rich bacterial species, we were able to construct a set of 275 homologous genes. Since these two bacterial species also have a very low growth rate, translational selection might not be as decisive for their codon preferences as it is in fast-growing bacteria. Indeed, principal-components analysis of codon usage from this set of homologous genes revealed that the codon choices in M. tuberculosis and M. leprae are correlated not only with compositional constraints and translational selection, but also with the degree of amino acid conservation and the hydrophobicity of the encoded proteins. Finally, significant correlations were found between GC3 and synonymous distances as well as between synonymous and nonsynonymous distances. Received: 30 October 1998 / Accepted: 16 August 1999

12.
Southern hybridization data suggest that the male sex-determining locus, Sry, is often duplicated in rodents. Here we explore DNA sequence evolution of orthologous and paralogous copies of Sry isolated from six species of African murines. PCR amplification followed by direct sequencing revealed from two to four copies of Sry per species. All copies include a long open reading frame, with a stop codon that coincides closely with the stop codon of the house mouse, Mus musculus, a species known to have a single copy of Sry. A phylogenetic analysis suggests that there are at least seven paralogous copies of Sry in this group of rodents. Putative orthologues are identical; sequence divergence among putative paralogues ranges from 1 to 8% (excluding the CAG repeat), with much lower levels of divergence in the high-mobility group (HMG-box) region than in the C-terminal region. A high proportion of nucleotide substitutions in both regions result in amino-acid replacement. The long open reading frame, conserved HMG-box, and pattern of evolution of the putative paralogues suggest that they are functional. Received: 4 October 1996 / Accepted: 17 January 1997

13.
In bacteria, synonymous codon usage can be considerably affected by base composition at neighboring sites. Such context-dependent biases may be caused by either selection against specific nucleotide motifs or context-dependent mutation biases. Here we consider the evolutionary conservation of context-dependent codon bias across 11 completely sequenced bacterial genomes. In particular, we focus on two contextual biases previously identified in Escherichia coli: the avoidance of out-of-frame stop codons and the avoidance of AGG motifs. By identifying homologues of E. coli genes, we also investigate the effect of gene expression level in Haemophilus influenzae and Mycoplasma genitalium. We find that while context-dependent codon biases are widespread in bacteria, few are conserved across all species considered. Avoidance of out-of-frame stop codons does not apply to all stop codons or amino acids in E. coli, does not hold for different species, does not increase with gene expression level, and is not relaxed in Mycoplasma spp., in which the canonical stop codon, TGA, is recognized as tryptophan. Avoidance of AGG motifs shows some evolutionary conservation and increases with gene expression level in E. coli, suggestive of the action of selection, but the cause of the bias differs between species. These results demonstrate that strong context-dependent forces, both selective and mutational, operate on synonymous codon usage but that these differ considerably between genomes. Received: 6 May 1999 / Accepted: 29 October 1999

14.
The protein sequence of ATP/CTP:tRNA nucleotidyltransferase (cca) from Sulfolobus shibatae was used to search open reading frames in the genome of Methanococcus jannaschii. Translations of two unidentified open reading frames showed significant sequence similarity to portions of the Sulfolobus cca protein. When the two open reading frames were joined together, the expanded open reading frame was similar in sequence to the entire Sulfolobus cca protein and displayed features of the active site signature sequence proposed for members of class I enzymes within the superfamily of nucleotidyltransferases (Yue et al., 1996, RNA 2, 895–908). A possible UUG start codon was identified based on significant sequence similarity of the resulting amino-terminal region to that of Sulfolobus, and on a six-base complementarity between an adjacent upstream sequence and Methanococcus 16S rRNA. Received: 10 February 1997

15.
In the plant chloroplast genome the codon usage of the highly expressed psbA gene is unique and is adapted to the tRNA population, probably due to selection for translation efficiency. In this study the role of selection on codon usage in each of the fully sequenced chloroplast genomes, in addition to Chlamydomonas reinhardtii, is investigated by measuring adaptation to this pattern of codon usage. A method is developed which tests selection on each gene individually by constructing sequences with the same amino acid composition as the gene and randomly assigning codons based on the nucleotide composition of noncoding regions of that genome. The codon bias of the actual gene is then compared to a distribution of random sequences. The data indicate that within the algae selection is strong in Cyanophora paradoxa, affecting a majority of genes, of intermediate intensity in Odontella sinensis, and weaker in Porphyra purpurea and Euglena gracilis. In the plants, selection is found to be quite weak in Pinus thunbergii and the angiosperms but there is evidence that an intermediate level of selection exists in the liverwort Marchantia polymorpha. The role of selection is then further investigated in two comparative studies. It is shown that average relative codon bias is correlated with expression level and that, despite saturation levels of substitution, there is a strong correlation among the algal genomes in the degree of codon bias of homologous genes. All of these data indicate that selection for translation efficiency plays a significant role in determining the codon bias of chloroplast genes but that it acts with different intensities in different lineages. In general it is stronger in the algae than the higher plants, but within the algae Euglena is found to have several unusual features which are noted. The factors that might be responsible for this variation in intensity among the various genomes are discussed. Received: 6 June 1997 / Accepted: 24 July 1997
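
The randomization test described in this abstract might be sketched as follows. This is only an illustration under stated assumptions: the amino acid sequence is kept fixed (which preserves its composition), each synonymous codon is drawn with probability proportional to the product of the noncoding base frequencies of its three bases, and the codon-bias measure is left to a caller-supplied `score` function. All names (`random_recoding`, `empirical_p_value`, `score`) are hypothetical, and the standard genetic code is assumed.

```python
import random
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption;
# other translation tables would need a different amino acid string).
_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), _AAS)}


def translate(cds):
    """Translate an in-frame coding sequence with the table above."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def random_recoding(protein, base_freqs, rng):
    """Re-encode `protein`, drawing each codon in proportion to the
    product of the background frequencies of its three bases."""
    by_aa = {}
    for codon, aa in CODON_TABLE.items():
        weight = base_freqs[codon[0]] * base_freqs[codon[1]] * base_freqs[codon[2]]
        by_aa.setdefault(aa, []).append((codon, weight))
    out = []
    for aa in protein:
        codons, weights = zip(*by_aa[aa])
        out.append(rng.choices(codons, weights=weights, k=1)[0])
    return "".join(out)


def empirical_p_value(gene, base_freqs, score, n=1000, seed=0):
    """Share of random recodings whose bias score is at least that of the
    real gene (add-one corrected so the estimate is never zero)."""
    rng = random.Random(seed)
    protein = translate(gene)
    observed = score(gene)
    hits = sum(
        score(random_recoding(protein, base_freqs, rng)) >= observed
        for _ in range(n)
    )
    return (hits + 1) / (n + 1)
```

Here `base_freqs` would be estimated from the noncoding regions of the genome in question, and `score` would be whatever measure of adaptation to the psbA codon usage pattern is chosen; neither is specified in enough detail in the abstract to reproduce exactly.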

16.
We compared the codon usage of sequences of transposable elements (TEs) with that of host genes from the species Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Saccharomyces cerevisiae, and Homo sapiens. Factorial correspondence analysis showed that, regardless of the base composition of the genome, the TEs differed from the genes of their host species by their AT-richness. In all species, the percentage of A + T at the third codon position of the TEs was higher than that at the first codon position and lower than that in the noncoding DNA of the genomes. This indicates that the codon choice is not simply the outcome of mutational bias but is also subject to selection constraints. A tendency toward higher A + T at the third position than at the first position was also found in the host genes of A. thaliana, C. elegans, and S. cerevisiae but not in those of D. melanogaster and H. sapiens. This strongly suggests that the AT choice is a host-independent characteristic common to all TEs. The codon usage of TEs generally appeared to be different from the mean of the host genes. In the AT-rich genomes of Arabidopsis thaliana, Caenorhabditis elegans, and Saccharomyces cerevisiae, the codon usage bias of TEs was similar to that of weakly expressed genes. In the GC-rich genome of D. melanogaster, however, the bias in codon usage of the TEs clearly differed from that of weakly expressed genes. These findings suggest that selection acts on TEs and that TEs may display specific behavior within the host genomes. Received: 2 May 2001 / Accepted: 29 October 2001

17.
In this study, we analyzed the correlation between codon usage bias and Shine–Dalgarno (SD) sequence conservation, using complete genome sequences of nine prokaryotes. For codon usage bias, we adopted the codon adaptation index (CAI), which is based on the codon usage preference of genes encoding ribosomal proteins, elongation factors, heat shock proteins, outer membrane proteins, and RNA polymerase subunit proteins. To compute SD sequence conservation, we used SD motif sequences predicted by Tompa and systematically aligned them with 5′UTR sequences. We found that there exists a clear correlation between the CAI values and SD sequence conservation in the genomes of Escherichia coli, Bacillus subtilis, Haemophilus influenzae, Archaeoglobus fulgidus, Methanobacterium thermoautotrophicum, and Methanococcus jannaschii, and no relationship is found in M. genitalium, M. pneumoniae, and Synechocystis. That is, genes with higher CAI values tend to have more conserved SD sequences than do genes with lower CAI values in these organisms. Some organisms, such as M. thermoautotrophicum, do not clearly show the correlation. The biological significance of these results is discussed in the context of the translation initiation process and translation efficiency. Received: 22 June 2000 / Accepted: 18 October 2000
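
For orientation, the codon adaptation index can be sketched as the geometric mean of per-codon relative adaptiveness values, where each value is a codon's count in a reference set of highly expressed genes divided by the count of the most used synonymous codon for the same amino acid. The block below is a minimal sketch, not the procedure used in the study: the standard genetic code, the 0.5 pseudocount for codons absent from the reference set, and the function names are assumptions.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption).
_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), _AAS)}


def codons(seq):
    """Split an in-frame sequence into codons, dropping a partial tail."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes, pseudocount=0.5):
    """w = count of a codon / count of the most used synonymous codon,
    estimated from a reference set (e.g. ribosomal protein genes)."""
    counts = Counter(
        c for gene in reference_genes for c in codons(gene) if c in CODON_TABLE
    )
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)
    w = {}
    for family in families.values():
        if len(family) == 1:
            continue  # Met and Trp carry no synonymous choice
        top = max(counts[c] for c in family)
        for c in family:
            # The pseudocount keeps unseen codons from producing log(0).
            w[c] = max(counts[c], pseudocount) / max(top, pseudocount)
    return w


def cai(gene, w):
    """Geometric mean of w over the codons of `gene` that have a w value."""
    logs = [math.log(w[c]) for c in codons(gene) if c in w]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")
```

A gene that uses only the most frequent reference codons scores 1.0, and the score falls toward 0 as rarer synonymous codons are used.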

18.
Tandemly duplicated actin genes have been isolated from a Helicoverpa armigera genomic library. Sequence comparisons with actin genes from other species suggest they encode cytoplasmic actins, being most closely related to the Bombyx mori A3 actin gene. The duplicated H. armigera actin genes, termed A3a and A3b, share 98.3% nucleotide sequence identity over their entire putative coding region. Analysis of the distribution of nucleotide differences shows the first 763 bp are identical between the two coding regions, with the 18 nucleotide changes occurring in the remaining 366 bp. This observation suggests a gene conversion event has taken place between the duplicated H. armigera A3a and A3b actin genes. Translation of the open-reading frames indicates the products of these genes are identical, apart from a single amino acid difference at codon 273. Polymerase chain reaction and northern blot analysis have shown both H. armigera A3a and A3b genes are expressed during pupal development and in the brain of newly eclosed adults. A region 5′ of the H. armigera A3a actin gene start codon has been identified which contains regulatory sequences commonly found in the promoter region of actin genes, including TATA, CAAT, and CArG motifs. Received: 10 January 1996 / Accepted: 12 March 1996

19.
Along the gene, nucleotides in various codon positions tend to exert a slight but observable influence on the nucleotide choice at neighboring positions. Such context biases are different in different organisms and can be used as genomic signatures. In this paper, we will focus specifically on the dinucleotide composed of a third codon position nucleotide and its succeeding first position nucleotide. Using the 16 possible dinucleotide combinations, we calculate how well individual genes conform to the observed mean dinucleotide frequencies of an entire genome, forming a distance measure for each gene. It is found that genes from different genomes can be separated with a high degree of accuracy, according to these distance values. In particular, we address the problem of recent horizontal gene transfer, and how imported genes may be evaluated by their poor assimilation to the host's context biases. By concentrating on the third- and succeeding first position nucleotides, we eliminate most spurious contributions from codon usage and amino-acid requirements, focusing mainly on mutational effects. Since imported genes are expected to converge only gradually to genomic signatures, it is possible to question whether a gene present in only one of two closely related organisms has been imported into one organism or deleted in the other. Striking correlations between the proposed distance measure and poor homology are observed when Escherichia coli genes are compared to Salmonella typhi, indicating that sets of outlier genes in E. coli may contain a high number of genes that have been imported into E. coli, and not deleted in S. typhi. Received: 16 January 2001 / Accepted: 30 August 2001
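
The signature described here could be sketched as below. The abstract does not give the exact distance formula, so this example assumes a simple sum of absolute differences between a gene's 16 junction-dinucleotide frequencies and the pooled genome frequencies. The function names are hypothetical and the inputs are assumed to be in-frame coding sequences.

```python
from collections import Counter
from itertools import product

DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def junction_counts(cds):
    """Count dinucleotides made of codon position 3 and the first base
    of the following codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter()
    # Index 2, 5, 8, ... is a third position; the last codon has no successor.
    for i in range(2, usable - 1, 3):
        pair = cds[i:i + 2]
        if pair in DINUCLEOTIDES:  # skips pairs with ambiguous bases
            counts[pair] += 1
    return counts


def frequencies(counts):
    """Turn dinucleotide counts into frequencies over all 16 combinations."""
    total = sum(counts.values())
    return {d: counts[d] / total if total else 0.0 for d in DINUCLEOTIDES}


def genome_profile(genes):
    """Mean junction-dinucleotide frequencies pooled over all genes."""
    pooled = Counter()
    for gene in genes:
        pooled.update(junction_counts(gene))
    return frequencies(pooled)


def gene_distance(cds, profile):
    """Sum of absolute frequency differences from the genome profile
    (an assumed metric; the abstract does not specify one)."""
    f = frequencies(junction_counts(cds))
    return sum(abs(f[d] - profile[d]) for d in DINUCLEOTIDES)
```

Genes with unusually large distances would be the candidates for recent import under the reasoning of the abstract, although any threshold would have to be chosen empirically.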

20.
Detailed nucleotide diversity studies revealed that the fil1 gene of Antirrhinum, which has been reported to be single copy, is a member of a gene family composed of at least five genes. In four Antirrhinum majus populations with different mating systems and one A. graniticum population, diversity within populations is very low. Divergence among Antirrhinum species and between Antirrhinum and Digitalis is also low. For three of these genes we also obtained sequences from a more divergent member of the Scrophulariaceae, Verbascum nigrum. Compared with Antirrhinum, little divergence is again observed. These results, together with similar data obtained previously for five cycloidea genes, suggest either that these gene families (or the Antirrhinum genome) are unusually constrained or that there is a low rate of substitution in these lineages. Using a sample of 52 genes, based on two measures of codon usage (ENC and GC3 content), we show that cyc and fil1 are among the least biased Antirrhinum genes, so that their low diversity is not due to extreme codon bias. Received: 20 June 2000 / Accepted: 25 October 2000

