Similar literature
Found 20 similar articles.
1.
2.
Biased codon usage is common in eukaryotic and prokaryotic genes. Evidence from Escherichia, Saccharomyces, and Drosophila indicates that it favors translational efficiency and accuracy. However, to date no functional advantages have been identified in the codon–anticodon interactions involving the most frequently used (preferred) codons. Here we present evidence that forces not related to the individual codon–anticodon interaction may be involved in determining which synonymous codons are preferred or avoided. We show that the "off-frame" trinucleotide motif preferences inferrable from Drosophila coding regions are often in the same direction as Drosophila's "in-frame" codon preferences, i.e., its codon usage. The off-frame preferences were inferred from the nonrandomness of the location of confamilial synonymous codons along coding regions, a pattern often described as a context dependence of nucleotide choice at synonymous positions or as codon-pair bias. We relied on randomizations of the location of confamilial codons that do not alter, and cannot be influenced by, the encoded amino acid sequences, codon usage, or base composition of the genes examined. The statistically significant congruency of in-frame and off-frame trinucleotide preferences suggests that the same kind of reading-frame-independent force(s) may also influence synonymous codon choice. These forces may have produced biases in codon usage that then led to the evolution of the translational advantages of these motifs as preferred codons. Under this scenario, tRNA pool size differences between preferred and nonpreferred codons initially evolved to track the default overrepresentation of codons with preferred motifs. The motif preference hypothesis can explain the structuring of codon preferences and the similarities in the codon usages of distantly related organisms. Received: 10 November 1998 / Accepted: 23 February 1999
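
The randomization described in this abstract can be illustrated with a short sketch. It is not the authors' procedure: it assumes that "confamilial" codons are simply those encoding the same amino acid under the standard genetic code, and the names `CODE` and `shuffle_confamilial` are hypothetical. Codons are shuffled only among positions that encode the same amino acid, so the protein sequence, codon usage, and base composition of the gene are unchanged.

```python
import random
from collections import defaultdict

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def shuffle_confamilial(cds, seed=None):
    """Permute synonymous codons among the positions of their amino acid."""
    rng = random.Random(seed)
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]

    # Group codon positions by encoded amino acid; unknown codons
    # (e.g. containing N) form their own group and therefore stay put
    # relative to all translatable codons.
    positions = defaultdict(list)
    for index, codon in enumerate(codons):
        positions[CODE.get(codon, codon)].append(index)

    shuffled = list(codons)
    for indices in positions.values():
        pool = [codons[i] for i in indices]
        rng.shuffle(pool)                      # shuffle within one family only
        for i, codon in zip(indices, pool):
            shuffled[i] = codon
    return "".join(shuffled) + cds[usable:]
```

Comparing a statistic such as codon-pair counts between the real gene and many shuffled copies would then give a null distribution that is independent of amino acid sequence and codon usage.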

3.
4.
Insertions and deletions of entire codons have recently been discovered as a mechanism by which B cells, in addition to conventional base substitution, evolve the antibodies produced by their immunoglobulin genes. These events frequently seem to involve repetitive sequence motifs in the antibody-encoding genes, and it has been suggested that they occur through polymerase slippage. In order to better understand the process of codon deletion, we have analyzed the human immunoglobulin heavy variable (IGHV) germline gene repertoire for the presence of trinucleotide repeats. Such repeats would ensure that the reading frame is maintained in the case of a deletional event, as slippage over multiples of three bases would be favored. We demonstrate here that IGHV genes specifically carry repetitive trinucleotide motifs in the complementarity-determining regions (CDR) 1 and 2, thus making these parts of the genes that encode highly flexible structures particularly prone to functional deletions. We propose that the human IGHV repertoire carries inherent motifs that allow an antibody response to develop efficiently by targeting codon deletion events to the parts of the molecule that are likely to be able to harbor such modifications. Received: 10 April 2001 / Accepted: 27 August 2001
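
A scan for tandem trinucleotide repeats of the kind discussed here can be sketched with a regular expression. This is a minimal illustration, not the method used in the study; the function name `find_trinucleotide_repeats` and the `min_copies` threshold are assumptions. The search is frame-agnostic and only finds perfect repeats, and a homopolymer run such as AAAAAA also counts as a repeated triplet.

```python
import re


def find_trinucleotide_repeats(seq, min_copies=2):
    """Return (start, unit, copies) for each perfect tandem triplet repeat."""
    # A 3-base unit followed by at least (min_copies - 1) further copies.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_copies - 1))
    return [(m.start(), m.group(1), len(m.group(0)) // 3)
            for m in pattern.finditer(seq.upper())]
```

Because each reported repeat spans a multiple of three bases, deleting one unit from it leaves the downstream reading frame intact, which is the property the abstract emphasizes.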

5.
Li H, Liu J, Wu K, Chen Y. PLoS ONE, 2012, 7(7): e41167
Glutamine tandem repeats are common in eukaryotic proteins. Although some studies have proposed that replication slippage plays an important role in shaping these repeats, the role of natural selection in glutamine tandem repeat evolution is somewhat unclear. In this study, we identified all of the glutamine tandem repeats containing four or more glutamines in human proteins and then estimated the nonsynonymous (dN) and synonymous (dS) substitution rates for the regions flanking the glutamine tandem repeats and the proteins containing them. The results indicated that most of the proteins containing polyglutamine (polyQ) tracts of four or more glutamines have undergone purifying selection, and that the purifying selection for the regions flanking the repeats is weaker. Additionally, we observed that the conserved repeats were under stronger selection constraints than the nonconserved repeats. Interestingly, we found that there was a higher level of purifying selection for the regions flanking the polyQ tracts encoded by pure CAG codons compared with those encoded by mixed codons. Based on our findings, we propose that selection has played a more important role than was previously speculated in constraining the expansion of polyQ tracts encoded by pure codons.
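
The first step described in this abstract, locating polyQ tracts of four or more glutamines and separating those encoded by pure CAG codons from those encoded by mixed codons, might look like the sketch below. The function name `polyq_tracts` and its return format are hypothetical, and the code assumes the coding sequence is in frame with the protein and starts at its first residue.

```python
import re


def polyq_tracts(protein, cds=None, min_len=4):
    """List polyQ tracts; if a CDS is given, flag tracts that are pure CAG."""
    tracts = []
    for match in re.finditer(r"Q{%d,}" % min_len, protein.upper()):
        entry = {"start": match.start(),
                 "length": match.end() - match.start()}
        if cds is not None:
            # Residue i is encoded by CDS bases 3*i .. 3*i + 2.
            codons = [cds[3 * i:3 * i + 3].upper()
                      for i in range(match.start(), match.end())]
            entry["pure_cag"] = all(codon == "CAG" for codon in codons)
        tracts.append(entry)
    return tracts
```

Estimating dN and dS for the flanking regions would require aligned orthologous sequences and a dedicated substitution-rate method, which this sketch does not attempt.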

6.
Evolution of proteins encoded in nucleotide sequences began with the advent of the triplet code. The chronological order of the appearance of amino acids on the evolutionary scene and the steps in the evolution of the triplet code have recently been reconstructed (Trifonov, 2000b) on the basis of 40 different ranking criteria and hypotheses. According to the consensus chronology, the pair of complementary GGC and GCC codons for the amino acids glycine and alanine, respectively, appeared first. Other codons appeared as complementary pairs as well, which divided their respective amino acids into two alphabets, encoded by triplets with either central purines or central pyrimidines: G, D, S, E, N, R, K, Q, C, H, Y, and W (Glycine alphabet G) and A, V, P, S, L, T, I, F, and M (Alanine alphabet A). It is speculated that the earliest polypeptide chains were very short, presumably of uniform length, belonging to two alphabet types encoded in the two complementary strands of the earliest mRNA duplexes. After the fusion of the minigenes, a mosaic of the alphabets would form. Traces of the predicted mosaic structure have indeed been detected in the protein sequences of complete prokaryotic genomes, in the form of weak oscillations with a period of 12 residues that reflect the alternation of two types of 6-residue-long units. The next stage of protein evolution corresponded to the closure of the chains into loops of 25–30 residues (Berezovsky et al., 2000). Autocorrelation analysis of proteins of 23 complete archaebacterial and eubacterial genomes revealed that the preferred distances between valine, alanine, glycine, leucine, and isoleucine along the sequences are in the same range of 25–30 residues, indicating that the loops are primarily closed by hydrophobic interactions between the ends of the loops. The loop closure stage is followed by the formation of typical folds of 100–200 amino acids, via end-to-end fusion of the genes encoding the loop-size chains. This size was apparently dictated by the optimal ring closure for DNA. In both cases the closure into the ring (loop) rendered evolutionarily advantageous stability to the respective structures. Further gene fusions led to the formation of modern multidomain proteins. Recombinational gene splicing is likely to have appeared after the DNA circularization stage. Received: 21 December 2000 / Accepted: 28 February 2001

7.
Circular permutations of genes during molecular evolution often are regarded as elusive, although a simple model can explain these rearrangements. The model assumes that first a gene duplication of the precursor gene occurs in such a way that both genes become fused in frame, leading to a tandem protein. After generation of a new start codon within the 5′ part of the tandem gene and a stop at an equivalent position in the 3′ part of the gene, a protein is encoded that represents a perfect circular permutation of the precursor gene product. The model is illustrated here by the molecular evolution of adenine-N6 DNA methyltransferases. β- and γ-type enzymes of this family can be interconverted by a single circular permutation event. Interestingly, tandem proteins, proposed as evolutionary intermediates during circular permutation, can be directly observed in the case of adenine methyltransferases, because some enzymes belonging to type IIS, like the FokI methyltransferase, are built up by two fused enzymes, both of which are active independently of each other. The mechanism for circular permutation illustrated here is very easy and applicable to every protein. Thus, circular permutation can be regarded as a normal process in molecular evolution and a changed order of conserved amino acid motifs should not be interpreted to argue against divergent evolution. Received: 17 November 1998 / Accepted: 19 February 1999
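
The duplication model in this abstract can be mimicked at the protein level in a few lines. This is a toy sketch under simplifying assumptions: the duplication is modeled as an exact in-frame tandem copy, and the hypothetical function `circular_permutation` takes the new start as a residue index in the 5′ copy and places the stop at the equivalent position in the 3′ copy.

```python
def circular_permutation(protein, new_start):
    """Model duplication, in-frame fusion, then a new start and stop."""
    if not 0 <= new_start < len(protein):
        raise ValueError("new_start must lie within the first gene copy")
    tandem = protein + protein                 # in-frame tandem fusion
    # Reading one precursor length from the new start ends at the
    # equivalent position in the second copy.
    return tandem[new_start:new_start + len(protein)]
```

For example, a precursor with motif order A-B-C-D-E-F and a new start at the third residue yields C-D-E-F-A-B: every residue is retained, but the order of conserved motifs changes, as the abstract notes.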

8.
9.
In many unicellular organisms, invertebrates, and plants, synonymous codon usage biases result from a coadaptation between codon usage and tRNA abundance to optimize the efficiency of protein synthesis. However, it remains unclear whether natural selection acts at the level of the speed or the accuracy of mRNA translation. Here we show that codon usage can improve the fidelity of protein synthesis in multicellular species. As predicted by the model of selection for translational accuracy, we find that the frequency of codons optimal for translation is significantly higher at codons encoding conserved amino acids than at codons encoding nonconserved amino acids in 548 genes compared between Caenorhabditis elegans and Homo sapiens. Although this model predicts that codon bias correlates positively with gene length, a negative correlation between codon bias and gene length has been observed in eukaryotes. This suggests that selection for fidelity of protein synthesis is not the main factor responsible for codon biases. The relationship between codon bias and gene length remains unexplained. Exploring the differences in the gene expression process between eukaryotes and prokaryotes should provide new insights into this key question of codon usage. Received: 18 June 2000 / Accepted: 10 November 2000

10.
A DNA fragment containing short tandem repeat sequences (approximately 86-bp repeat) was isolated from a Xenopus laevis cDNA library. Southern blot and in situ hybridization analyses revealed that the repeat was highly dispersed in the genome and was present at approximately 1 million copies per haploid genome. We named this element Xstir (Xenopus short tandemly and invertedly repeating element) after its arrangement in the genome. The majority of the genomic Xstir sequences were digested to monomer and dimer sizes with several restriction enzymes. Their sequences were found to be highly homogeneous and organized into tandem arrays in the genome. Alignment analyses of several known sequences showed that some of the Xstir-like sequences were also organized into interspersed inverted repeats. The inverted repeats consisted of an inverted pair of two differently modified Xstirs separated by a short insert. In addition, these were framed by another novel inverted repeat (Xstir-TIR). The Xstir-TIR sequence was also found at the ends of tandem Xstir arrays. Furthermore, we found that Xstir-TIR was linked to a motif characterizing the T2 family which belonged to a vertebrate MITE (miniature inverted-repeat transposable element) family, suggesting the importance of Xstir-TIR for their amplification and transposition. The present study of 11 anuran and 2 urodele species revealed that Xstir or Xstir-like sequences were extensively amplified in the three Xenopus species. Genomic Xstir populations of X. borealis and X. laevis were mutually indistinguishable but significantly different from that of X. tropicalis. Received: 5 April 2000 / Accepted: 3 August 2000

11.
Tandemly repeated sequences are a major component of the eukaryotic genome. Although the general characteristics of tandem repeats have been well documented, the processes involved in their origin and maintenance remain unknown. In this study, a region on the paternal sex ratio (PSR) chromosome was analyzed to investigate the mechanisms of tandem repeat evolution. The region contains a junction between a tandem array of PSR2 repeats and a copy of the retrotransposon NATE, with other dispersed repeats (putative mobile elements) on the other side of the element. Little similarity was detected between the sequence of PSR2 and the region of NATE flanking the array, indicating that the PSR2 repeat did not originate from the underlying NATE sequence. However, a short region of sequence similarity (11/15 bp) and an inverted region of sequence identity (8 bp) are present on either side of the junction. These short sequences may have facilitated nonhomologous recombination between NATE and PSR2, resulting in the formation of the junction. Adjacent to the junction, the three most terminal repeats in the PSR2 array exhibited a higher sequence divergence relative to internal repeats, which is consistent with a theoretical prediction of the unequal exchange model for tandem repeat evolution. Other NATE insertion sites were characterized which show proximity to both tandem repeats and complex DNAs containing additional dispersed repeats. An "accretion model" is proposed to account for this association by the accumulation of mobile elements at the ends of tandem arrays and into "islands" within arrays. Mobile elements inserting into arrays will tend to migrate into islands and to array ends, due to the turnover in the number of intervening repeats. Received: 18 August 1997 / Accepted: 18 September 1998

12.
In this study, we analyzed the correlation between codon usage bias and Shine–Dalgarno (SD) sequence conservation, using complete genome sequences of nine prokaryotes. For codon usage bias, we adopted the codon adaptation index (CAI), which is based on the codon usage preference of genes encoding ribosomal proteins, elongation factors, heat shock proteins, outer membrane proteins, and RNA polymerase subunit proteins. To compute SD sequence conservation, we used SD motif sequences predicted by Tompa and systematically aligned them with 5′UTR sequences. We found a clear correlation between the CAI values and SD sequence conservation in the genomes of Escherichia coli, Bacillus subtilis, Haemophilus influenzae, Archaeoglobus fulgidus, Methanobacterium thermoautotrophicum, and Methanococcus jannaschii, whereas no relationship was found in M. genitalium, M. pneumoniae, and Synechocystis. That is, genes with higher CAI values tend to have more conserved SD sequences than do genes with lower CAI values in these organisms. Some organisms, such as M. thermoautotrophicum, do not clearly show the correlation. The biological significance of these results is discussed in the context of the translation initiation process and translation efficiency. Received: 22 June 2000 / Accepted: 18 October 2000
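
The codon adaptation index mentioned in this abstract can be sketched as the geometric mean of per-codon relative adaptiveness values derived from a reference set of highly expressed genes. The sketch below is a simplified illustration rather than the study's implementation: the names `relative_adaptiveness` and `cai` are hypothetical, codons absent from the reference set are skipped instead of receiving a pseudocount, and Met, Trp, and stop codons are excluded.

```python
import math
from collections import Counter

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def codons_of(seq):
    """Split a coding sequence into complete codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes):
    """w(codon) = count / count of the most used synonymous codon."""
    counts = Counter(c for gene in reference_genes
                     for c in codons_of(gene) if c in CODE)
    weights = {}
    for codon, amino in CODE.items():
        family = [c for c, a in CODE.items() if a == amino]
        best = max(counts[c] for c in family)
        if best > 0 and counts[codon] > 0:
            weights[codon] = counts[codon] / best
    return weights


def cai(gene, weights):
    """Geometric mean of w over the scorable codons of a gene."""
    logs = [math.log(weights[c]) for c in codons_of(gene)
            if c in weights and CODE[c] not in "MW*"]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")
```

In a study like this one, `reference_genes` would hold the ribosomal protein, elongation factor, and other highly expressed genes listed in the abstract, and `cai` would then be evaluated for every gene in the genome.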

13.
Endosymbiotic bacteria live in animal cells and are transmitted vertically at the time of the host's reproduction. In view of their small and asexual populations with infrequent chances of recombination, these endocellular bacteria are expected to accumulate mildly deleterious mutations. Previous studies showed that the DNA sequences of these bacteria evolved faster than those of free-living bacteria. In this study, we compared all the ORFs of Buchnera, an endocellular bacterial symbiont of aphids, with those of 34 other prokaryotic organisms and estimated the effect of the accelerated evolution of Buchnera on the functions of its proteins. It was revealed that Buchnera proteins contain many mutations at the sites where sequences are conserved in their orthologues in many other organisms. In addition, amino acid replacements at the conserved sites are mostly changes to physicochemically different amino acids. These results suggest that functions and conformations of Buchnera proteins have been seriously impaired or strongly modified. Indeed, extensive loss of functional motifs was observed in some Buchnera proteins. In many Buchnera proteins mutations were not detected evenly throughout each molecule but tended to accumulate in some functional units, possibly leading to loss of specific functions. As Buchnera has an unusual and limited gene repertory, it is conceivable that the manner of interactions among its proteins has been changed, and thus, functional constraints over their amino acid residues have also been changed during evolution. This may account for the loss of some functional units only in the Buchnera proteins. We obtained evidence that amino acid replacements in Buchnera were not always deleterious, but neutral or, in some cases, even positively selected. Received: 14 December 2000 / Accepted: 12 March 2001

14.
Wolbachia are obligatory intracellular and maternally inherited bacteria, known to infect many species of arthropod. In this study, we discovered a bacteriophage-like genetic element in Wolbachia, which was tentatively named bacteriophage WO. The phylogenetic tree based on phage WO genes of several Wolbachia strains was not congruent with that based on chromosomal genes of the same strains, suggesting that phage WO was active and horizontally transmitted among various Wolbachia strains. All the strains of Wolbachia used in this study were infected with phage WO. Although the phage genome contained genes of diverse origins, the average G+C content and codon usage of these genes were quite similar to those of a chromosomal gene of Wolbachia. These results raised the possibility that phage WO has been associated with Wolbachia for a very long time, conferring some benefit to its hosts. The evolution and possible roles of phage WO in various reproductive alterations of insects caused by Wolbachia are discussed. Received: 28 January 2000 / Accepted: 3 August 2000

15.
Mycobacterium tuberculosis and Mycobacterium leprae are the etiological agents of tuberculosis and leprosy, respectively. After performing extensive comparisons between genes from these two GC-rich bacterial species, we were able to construct a set of 275 homologous genes. Since these two bacterial species also have a very low growth rate, translational selection may be less decisive for their codon preferences than it is in fast-growing bacteria. Indeed, principal-components analysis of codon usage from this set of homologous genes revealed that the codon choices in M. tuberculosis and M. leprae are correlated not only with compositional constraints and translational selection, but also with the degree of amino acid conservation and the hydrophobicity of the encoded proteins. Finally, significant correlations were found between GC3 and synonymous distances as well as between synonymous and nonsynonymous distances. Received: 30 October 1998 / Accepted: 16 August 1999

16.
Telomeres of most insects are composed of simple (TTAGG)n repeats that are synthesized by telomerase. However, in some dipteran insects such as Drosophila melanogaster, neither (TTAGG)n repeats nor telomerase activity has been detected. Although telomere structure is well documented in Diptera and Lepidoptera, very limited information is available on lower insect groups. To understand general aspects of telomere function and evolution in insects, we endeavored to characterize structures of the telomeric and subtelomeric regions in a lower insect, the Taiwan cricket, Teleogryllus taiwanemma. FISH analysis of this insect's chromosomes demonstrated (TTAGG)n repeat elements in all distal ends. Just proximal to the telomeric repeats, the highly conserved 9-kb long terminal unit (LTU) sequences are tandemly repeated. These were observed in four of six chromosomes, three autosomal ends, and one X-chromosomal end. LTU sequences represent about 0.2% of the T. taiwanemma genome. Each LTU contains a core (TTAGG)8-like sequence (TRLS) and five types of conserved sequences: ST (short telomere associated), J (joint), X, SR (satellite sequence rich), and Y. These vary in length from about 150 bp to 2.7 kb. The LTU sequence is defined as ST–J–TRLS–SR–X–Y–X–Y–X. Most LTU regions may be derived from the ancestral common sequence, which is observed in ST regions six times and at many other LTU sites. We could not find the LTU-like sequence in three other crickets including the closest species, T. emma, suggesting that the LTU in T. taiwanemma has been rapidly amplified in subtelomeric regions through recent evolutionary events. It is also suggested that the highly conserved structure of the LTU is maintained by recombination and may contribute to telomere elongation, as seen in dipteran insects. Received: 6 August 2001 / Accepted: 10 October 2001

17.
The extent to which base composition and codon usage vary among RNA viruses, and the possible causes of this bias, is undetermined in most cases. A maximum-likelihood statistical method was used to test whether base composition and codon usage bias covary with arthropod association in the genus Flavivirus, a major source of disease in humans and animals. Flaviviruses are transmitted by mosquitoes, by ticks, or directly between vertebrate hosts. Those viruses associated with ticks were found to have a significantly lower G+C content than non-vector-borne flaviviruses and this difference was present throughout the genome at all amino acids and codon positions. In contrast, mosquito-borne viruses had an intermediate G+C content which was not significantly different from those of the other two groups. In addition, biases in dinucleotide and codon usage that were independent of base composition were detected in all flaviviruses, but these did not covary with arthropod association. However, the overall effect of these biases was slight, suggesting only weak selection at synonymous sites. A preliminary analysis of base composition, codon usage, and vector specificity in other RNA virus families also revealed a possible association between base composition and vector specificity, although with biases different from those seen in the Flavivirus genus. Received: 29 August 2000 / Accepted: 19 December 2000
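
The per-position G+C content compared in this abstract can be computed with a short helper. This is a minimal sketch; the function name `gc_by_codon_position` is hypothetical, and the input is assumed to be an in-frame coding sequence. It does not reproduce the maximum-likelihood test used in the study.

```python
def gc_by_codon_position(cds):
    """Return the G+C fraction at codon positions 1, 2 and 3."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # drop a trailing partial codon
    fractions = []
    for position in range(3):
        bases = cds[position:usable:3]        # every third base from offset
        if bases:
            fractions.append(sum(b in "GC" for b in bases) / len(bases))
        else:
            fractions.append(float("nan"))
    return tuple(fractions)
```

The third value is the GC3 statistic, which mostly reflects synonymous sites, so comparing it between tick-borne, mosquito-borne, and non-vector-borne genomes would address the same question at the level of simple summary statistics.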

18.
Amino acid residues arginine (R) and lysine (K) have similar physicochemical characteristics and are often mutually substituted during evolution without affecting protein function. Statistical examinations on human proteins show that more R than K residues are used in the proximity of R residues, whereas more K than R are used near K residues. This biased use occurs on both a global and a local scale (shorter than ∼100 residues). Even within a given exon, G + C-rich and A + T-rich short DNA segments preferentially encode R and K, respectively. The biased use of R and K on a local scale is also seen in Saccharomyces cerevisiae and Caenorhabditis elegans, which lack global-scale mosaic structures with varying GC%, or isochores. Besides R and K, several amino acids are also used with a positive or negative correlation with the local GC% of third codon bases. The local-, or "within-gene"-, scale heterogeneity of the DNA sequence may influence the sequence of the encoded protein segment. Received: 2 March 1998 / Accepted: 23 April 1998

19.
Mularoni L, Veitia RA, Albà MM. Genomics, 2007, 89(3): 316-325
Single-amino-acid tandem repeats are very common in mammalian proteins but their function and evolution are still poorly understood. Here we investigate how the variability and prevalence of amino acid repeats are related to the evolutionary constraints operating on the proteins. We find a significant positive correlation between repeat size difference and protein nonsynonymous substitution rate in human and mouse orthologous genes. This association is observed for all the common amino acid repeat types and indicates that rapid diversification of repeat structures, involving both trinucleotide slippage and nucleotide substitutions, preferentially occurs in proteins subject to low selective constraints. However, strikingly, we also observe a significant negative correlation between the number of repeats in a protein and the gene nonsynonymous substitution rate, particularly for glutamine, glycine, and alanine repeats. This implies that proteins subject to strong selective constraints tend to contain an unexpectedly high number of repeats, which tend to be well conserved between the two species. This is consistent with a role for selection in the maintenance of a significant number of repeats. Analysis of the codon structure of the sequences encoding the repeats shows that codon purity is associated with high repeat size interspecific variability. Interestingly, polyalanine and polyglutamine repeats associated with disease show very distinctive features regarding the degree of repeat conservation and the protein sequence selective constraints.

20.
Retrovirus-like sequences and their solitary (solo) long terminal repeats (LTRs) are common repetitive elements in eukaryotic genomes. We reported previously that the tandemly arrayed genes encoding U2 snRNA (the RNU2 locus) in humans and apes contain a solo LTR (U2-LTR) which was presumably generated by homologous recombination between the two LTRs of an ancestral provirus that is retained in the orthologous baboon RNU2 locus. We have now sequenced the orthologous U2-LTRs in human, chimpanzee, gorilla, orangutan, and baboon and examined numerous homologs of the U2-LTR that are dispersed throughout the human genome. Although these U2-LTR homologs have been collectively referred to as LTR13 in the literature, they do not display sequence similarity to any known retroviral LTRs; however, the structure of LTR13 closely resembles that of other retroviral LTRs with a putative promoter, polyadenylation signal, and a tandemly repeated 53-bp enhancer-like element. Genomic blotting indicates that LTR13 is primate-specific; based on sequence analysis, we estimate there are about 2,500 LTR13 elements in the human genome. Comparison of the primate U2-LTR sequences suggests that the homologous recombination event that gave rise to the solo U2-LTR occurred soon after insertion of the ancestral provirus into the ancestral U2 tandem array. Phylogenetic analysis of the LTR13 family confirms that it is diverse, but the orthologous U2-LTRs form a coherent group in which chimpanzee is closest to the humans; orangutan is a clear outgroup of human, chimpanzee, and gorilla; and baboon is a distant relative of human, chimpanzee, gorilla, and orangutan. We compare the LTR13 family with other known LTRs and consider whether these LTRs might play a role in concerted evolution of the primate RNU2 locus. Received: 29 September 1997 / Accepted: 16 January 1998
