Found 20 similar articles (search time: 140 ms)
1.
2.
The Nonrandom Location of Synonymous Codons Suggests That Reading Frame-Independent Forces Have Patterned Codon Preferences (total citations: 6; self-citations: 0; citations by others: 6)
Biased codon usage is common in eukaryotic and prokaryotic genes. Evidence from Escherichia, Saccharomyces, and Drosophila indicates that it favors translational efficiency and accuracy. However, to date no functional advantages have been identified
in the codon–anticodon interactions involving the most frequently used (preferred) codons. Here we present evidence that forces
not related to the individual codon–anticodon interaction may be involved in determining which synonymous codons are preferred
or avoided. We show that the "off-frame" trinucleotide motif preferences inferrable from Drosophila coding regions are often in the same direction as Drosophila's "in-frame" codon preferences, i.e., its codon usage. The off-frame preferences were inferred from the nonrandomness of
the location of confamilial synonymous codons along coding regions—a pattern often described as a context dependence of nucleotide
choice at synonymous positions or as codon-pair bias. We relied on randomizations of the location of confamilial codons that
do not alter, and cannot be influenced by, the encoded amino acid sequences, codon usage, or base composition of the genes
examined. The statistically significant congruency of in-frame and off-frame trinucleotide preferences suggests that the same
kind of reading-frame-independent force(s) may also influence synonymous codon choice. These forces may have produced biases
in codon usage that then led to the evolution of the translational advantages of these motifs as preferred codons. Under this
scenario, tRNA pool size differences between preferred and nonpreferred codons initially evolved to track the default
overrepresentation of codons with preferred motifs. The motif preference hypothesis can explain the structuring of codon preferences
and the similarities in the codon usages of distantly related organisms.
Received: 10 November 1998 / Accepted: 23 February 1999
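The randomization described in this abstract (relocating confamilial synonymous codons without changing the encoded protein, codon usage, or base composition) can be sketched roughly as follows. This is only an illustration under stated assumptions: the codon table is a small excerpt, a "family" is taken to be all codons for one amino acid, and the function names `split_codons` and `shuffle_confamilial` are hypothetical rather than taken from the paper.

```python
import random
from collections import defaultdict

# Excerpt of the standard genetic code, just enough for a small demo.
# Assumption: a "family" here means all codons encoding the same amino acid.
CODON_TO_AA = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "GAA": "E", "GAG": "E",
    "ATG": "M",
}


def split_codons(cds):
    """Split a coding sequence into in-frame codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def shuffle_confamilial(cds, rng):
    """Permute synonymous codons among the positions of their own family.

    The amino acid sequence, the codon counts, and therefore the base
    composition are all unchanged; only codon locations change.
    """
    codons = split_codons(cds)
    positions = defaultdict(list)
    for index, codon in enumerate(codons):
        positions[CODON_TO_AA[codon]].append(index)

    shuffled = list(codons)
    for indices in positions.values():
        pool = [codons[i] for i in indices]
        rng.shuffle(pool)
        for i, codon in zip(indices, pool):
            shuffled[i] = codon
    return "".join(shuffled)


if __name__ == "__main__":
    original = "ATGGCTAAAGCCGAAAAGGCGGAGGCA"
    print(shuffle_confamilial(original, random.Random(1)))
```

Repeating the shuffle many times would give a null distribution against which an observed off-frame motif count could be compared; how the paper scores those motifs is not specified in the abstract and is not modeled here.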
3.
4.
Insertions and deletions of entire codons have recently been discovered as a mechanism by which B cells, in addition to conventional
base substitution, evolve the antibodies produced by their immunoglobulin genes. These events frequently seem to involve repetitive
sequence motifs in the antibody-encoding genes, and it has been suggested that they occur through polymerase slippage. In
order to better understand the process of codon deletion, we have analyzed the human immunoglobulin heavy variable (IGHV)
germline gene repertoire for the presence of trinucleotide repeats. Such repeats would ensure that the reading frame is maintained
in the case of a deletional event, as slippage over multiples of three bases would be favored. We demonstrate here that IGHV
genes specifically carry repetitive trinucleotide motifs in the complementarity-determining regions (CDR) 1 and 2, thus making
these parts of the genes that encode highly flexible structures particularly prone to functional deletions. We propose that
the human IGHV repertoire carries inherent motifs that allow an antibody response to develop efficiently by targeting codon
deletion events to the parts of the molecule that are likely to be able to harbor such modifications.
Received: 10 April 2001 / Accepted: 27 August 2001
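A scan for tandem trinucleotide repeats of the kind this abstract refers to might look like the sketch below. It is a minimal illustration, not the authors' procedure: the function name `find_trinucleotide_repeats` and the `min_copies` threshold are assumptions, and the regular expression also reports homopolymer runs such as AAAAAA as repeats of a three-base unit.

```python
import re


def find_trinucleotide_repeats(seq, min_copies=2):
    """Return (start, unit, copies) for each tandem run of a 3-base unit.

    start is a 0-based offset into seq; runs are non-overlapping and are
    reported leftmost-first, as found by the regular expression engine.
    """
    if min_copies < 2:
        raise ValueError("min_copies must be at least 2")
    # One captured unit followed by at least (min_copies - 1) exact copies.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        copies = (match.end() - match.start()) // 3
        hits.append((match.start(), match.group(1), copies))
    return hits


if __name__ == "__main__":
    print(find_trinucleotide_repeats("TTCAGCAGCAGTT"))
```

Because the reported start offset is relative to the input string, comparing it with the gene's reading frame (start modulo 3) would be a separate step.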
5.
Glutamine tandem repeats are common in eukaryotic proteins. Although some studies have proposed that replication slippage plays an important role in shaping these repeats, the role of natural selection in glutamine tandem repeat evolution is somewhat unclear. In this study, we identified all of the glutamine tandem repeats containing four or more glutamines in human proteins and then estimated the nonsynonymous (d(N)) and synonymous (d(S)) substitution rates for the regions flanking the glutamine tandem repeats and the proteins containing them. The results indicated that most of the proteins containing polyglutamine (polyQ) tracts of four or more glutamines have undergone purifying selection, and that the purifying selection for the regions flanking the repeats is weaker. Additionally, we observed that the conserved repeats were under stronger selection constraints than the nonconserved repeats. Interestingly, we found that there was a higher level of purifying selection for the regions flanking the polyQ tracts encoded by pure CAG codons compared with those encoded by mixed codons. Based on our findings, we propose that selection has played a more important role than was previously speculated in constraining the expansion of polyQ tracts encoded by pure codons.
6.
Edward N. Trifonov Alla Kirzhner Valery M. Kirzhner Igor N. Berezovsky 《Journal of molecular evolution》2001,53(4-5):394-401
Evolution of proteins encoded in nucleotide sequences began with the advent of the triplet code. The chronological order
of the appearance of amino acids on the evolution scene and the steps in the evolution of the triplet code have been recently
reconstructed (Trifonov, 2000b) on the basis of 40 different ranking criteria and hypotheses. According to the consensus chronology,
the pair of complementary GGC and GCC codons for the amino acids glycine and alanine, respectively, appeared first. Other codons appeared
as complementary pairs as well, which divided their respective amino acids into two alphabets, encoded by triplets with either
central purines or central pyrimidines: G, D, S, E, N, R, K, Q, C, H, Y, and W (Glycine alphabet G) and A, V, P, S, L, T, I, F, and M (Alanine alphabet A). It is speculated that the earliest polypeptide chains were very short, presumably of uniform length, belonging to two alphabet
types encoded in the two complementary strands of the earliest mRNA duplexes. After the fusion of the minigenes, a mosaic
of the alphabets would form. Traces of the predicted mosaic structure have been, indeed, detected in the protein sequences
of complete prokaryotic genomes as weak oscillations with a period of 12 residues, reflecting the alternation of
two types of 6-residue-long units. The next stage of protein evolution corresponded to the closure of the chains into loops
of 25–30 residues (Berezovsky et al., 2000). Autocorrelation analysis of proteins of 23 complete archaebacterial
and eubacterial genomes revealed that the preferred distances between valine, alanine, glycine, leucine, and isoleucine along
the sequences are in the same range of 25–30 residues, indicating that the loops are primarily closed by hydrophobic interactions
between the ends of the loops. The loop closure stage is followed by the formation of typical folds of 100–200 amino acids,
via end-to-end fusion of the genes encoding the loop-size chains. This size was apparently dictated by the optimal ring closure
for DNA. In both cases the closure into the ring (loop) rendered evolutionarily advantageous stability to the respective structures.
Further gene fusions lead to the formation of modern multidomain proteins. Recombinational gene splicing is likely to have
appeared after the DNA circularization stage.
Received: 21 December 2000 / Accepted: 28 February 2001
7.
Albert Jeltsch 《Journal of molecular evolution》1999,49(1):161-164
Circular permutations of genes during molecular evolution often are regarded as elusive, although a simple model can explain
these rearrangements. The model assumes that first a gene duplication of the precursor gene occurs in such a way that both
genes become fused in frame, leading to a tandem protein. After generation of a new start codon within the 5′ part of the
tandem gene and a stop at an equivalent position in the 3′ part of the gene, a protein is encoded that represents a perfect
circular permutation of the precursor gene product. The model is illustrated here by the molecular evolution of adenine-N6 DNA methyltransferases. β- and γ-type enzymes of this family can be interconverted by a single circular permutation event.
Interestingly, tandem proteins, proposed as evolutionary intermediates during circular permutation, can be directly observed
in the case of adenine methyltransferases, because some enzymes belonging to type IIS, like the FokI methyltransferase, are built up by two fused enzymes, both of which are active independently of each other. The mechanism
for circular permutation illustrated here is very easy and applicable to every protein. Thus, circular permutation can be
regarded as a normal process in molecular evolution and a changed order of conserved amino acid motifs should not be interpreted
to argue against divergent evolution.
Received: 17 November 1998 / Accepted: 19 February 1999
8.
9.
In many unicellular organisms, invertebrates, and plants, synonymous codon usage biases result from a coadaptation between
codon usage and tRNA abundance to optimize the efficiency of protein synthesis. However, it remains unclear whether natural
selection acts at the level of the speed or the accuracy of mRNA translation. Here we show that codon usage can improve the
fidelity of protein synthesis in multicellular species. As predicted by the model of selection for translational accuracy,
we find that the frequency of codons optimal for translation is significantly higher at codons encoding conserved amino
acids than at codons encoding for nonconserved amino acids in 548 genes compared between Caenorhabditis elegans and Homo sapiens. Although this model predicts that codon bias correlates positively with gene length, a negative correlation between codon
bias and gene length has been observed in eukaryotes. This suggests that selection for fidelity of protein synthesis is not
the main factor responsible for codon biases. The relationship between codon bias and gene length remains unexplained. Exploring
the differences in gene expression process in eukaryotes and prokaryotes should provide new insights to understand this key
question of codon usage.
Received: 18 June 2000 / Accepted: 10 November 2000
10.
A DNA fragment containing short tandem repeat sequences (approximately 86-bp repeat) was isolated from a Xenopus laevis cDNA library. Southern blot and in situ hybridization analyses revealed that the repeat was highly dispersed in the genome and was present at approximately 1 million
copies per haploid genome. We named this element Xstir (Xenopus short tandemly and invertedly repeating element) after its arrangement in the genome. The majority of the genomic Xstir sequences
were digested to monomer and dimer sizes with several restriction enzymes. Their sequences were found to be highly homogeneous
and organized into tandem arrays in the genome. Alignment analyses of several known sequences showed that some of the Xstir-like
sequences were also organized into interspersed inverted repeats. The inverted repeats consisted of an inverted pair of two
differently modified Xstirs separated by a short insert. In addition, these were framed by another novel inverted repeat (Xstir-TIR).
The Xstir-TIR sequence was also found at the ends of tandem Xstir arrays. Furthermore, we found that Xstir-TIR was linked
to a motif characterizing the T2 family which belonged to a vertebrate MITE (miniature inverted-repeat transposable element)
family, suggesting the importance of Xstir-TIR for their amplification and transposition. The present study of 11 anuran and
2 urodele species revealed that Xstir or Xstir-like sequences were extensively amplified in the three Xenopus species. Genomic Xstir populations of X. borealis and X. laevis were mutually indistinguishable but significantly different from that of X. tropicalis.
Received: 5 April 2000 / Accepted: 3 August 2000
11.
Tandemly repeated sequences are a major component of the eukaryotic genome. Although the general characteristics of tandem
repeats have been well documented, the processes involved in their origin and maintenance remain unknown. In this study, a
region on the paternal sex ratio (PSR) chromosome was analyzed to investigate the mechanisms of tandem repeat evolution. The
region contains a junction between a tandem array of PSR2 repeats and a copy of the retrotransposon NATE, with other dispersed repeats (putative mobile elements) on the other side of the element. Little similarity was detected
between the sequence of PSR2 and the region of NATE flanking the array, indicating that the PSR2 repeat did not originate from the underlying NATE sequence. However, a short region of sequence similarity (11/15 bp) and an inverted region of sequence identity (8 bp) are
present on either side of the junction. These short sequences may have facilitated nonhomologous recombination between NATE and PSR2, resulting in the formation of the junction. Adjacent to the junction, the three most terminal repeats in the PSR2
array exhibited a higher sequence divergence relative to internal repeats, which is consistent with a theoretical prediction
of the unequal exchange model for tandem repeat evolution. Other NATE insertion sites were characterized which show proximity to both tandem repeats and complex DNAs containing additional dispersed
repeats. An "accretion model" is proposed to account for this association by the accumulation of mobile elements at the
ends of tandem arrays and into "islands" within arrays. Mobile elements inserting into arrays will tend to migrate into
islands and to array ends, due to the turnover in the number of intervening repeats.
Received: 18 August 1997 / Accepted: 18 September 1998
12.
Sakai H Imamura C Osada Y Saito R Washio T Tomita M 《Journal of molecular evolution》2001,52(2):164-170
In this study, we analyzed the correlation between codon usage bias and Shine–Dalgarno (SD) sequence conservation, using
complete genome sequences of nine prokaryotes. For codon usage bias, we adopted the codon adaptation index (CAI), which is
based on the codon usage preference of genes encoding ribosomal proteins, elongation factors, heat shock proteins, outer membrane
proteins, and RNA polymerase subunit proteins. To compute SD sequence conservation, we used SD motif sequences predicted by
Tompa and systematically aligned them with 5′UTR sequences. We found that there exists a clear correlation between the CAI
values and SD sequence conservation in the genomes of Escherichia coli, Bacillus subtilis, Haemophilus influenzae, Archaeoglobus fulgidus, Methanobacterium thermoautotrophicum, and Methanococcus jannaschii, and no relationship is found in M. genitalium, M. pneumoniae, and Synechocystis. That is, genes with higher CAI values tend to have more conserved SD sequences than do genes with lower CAI values in these
organisms. Some organisms, such as M. thermoautotrophicum, do not clearly show the correlation. The biological significance of these results is discussed in the context of the translation
initiation process and translation efficiency.
Received: 22 June 2000 / Accepted: 18 October 2000
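For readers unfamiliar with the codon adaptation index used in this abstract, the sketch below follows the usual definition: each codon gets a relative adaptiveness w (its count in a reference gene set divided by the count of the most used synonymous codon), and the CAI of a gene is the geometric mean of w over its codons. Several details are assumptions for illustration only: the synonymous-family table is a three-amino-acid excerpt, codons missing from the reference are given a count of 0.5 so the logarithm stays defined, and the names `relative_adaptiveness` and `cai` are hypothetical.

```python
import math
from collections import Counter

# Excerpt of synonymous codon families (assumption: demo subset only).
SYNONYMOUS = {
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
}


def relative_adaptiveness(reference_codons):
    """Compute w for each codon from a reference set of highly expressed genes."""
    counts = Counter(reference_codons)
    weights = {}
    for family in SYNONYMOUS.values():
        best = max(counts[codon] for codon in family)
        if best == 0:
            continue  # family absent from the reference set; no weights defined
        for codon in family:
            # Assumed convention: unseen codons count as 0.5 to avoid log(0).
            weights[codon] = max(counts[codon], 0.5) / best
    return weights


def cai(gene_codons, weights):
    """Geometric mean of w over the codons of a gene that have a weight."""
    logs = [math.log(weights[codon]) for codon in gene_codons if codon in weights]
    if not logs:
        raise ValueError("no scorable codons in gene")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    reference = ["AAA"] * 8 + ["AAG"] * 2 + ["GAA"] * 5 + ["GAG"] * 5
    w = relative_adaptiveness(reference)
    print(cai(["AAA", "AAG", "GAA"], w))
```

In the study summarized above the reference set is built from genes for ribosomal proteins, elongation factors, and similar highly expressed proteins; the toy reference list here merely stands in for that.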
13.
Shuji Shigenobu Hidemi Watanabe Yoshiyuki Sakaki Hajime Ishikawa 《Journal of molecular evolution》2001,53(4-5):377-386
Endosymbiotic bacteria live in animal cells and are transmitted vertically at the time of the host's reproduction. In view
of their small and asexual populations with infrequent chances of recombination, these endocellular bacteria are expected
to accumulate mildly deleterious mutations. Previous studies showed that the DNA sequences of these bacteria evolved faster
than those of free-living bacteria. In this study, we compared all the ORFs of Buchnera, an endocellular bacterial symbiont of aphids, with those of 34 other prokaryotic organisms and estimated the effect of the
accelerated evolution of Buchnera on the functions of its proteins. It was revealed that Buchnera proteins contain many mutations at the sites where sequences are conserved in their orthologues in many other organisms.
In addition, amino acid replacements at the conserved sites are mostly changes to physicochemically different amino acids.
These results suggest that functions and conformations of Buchnera proteins have been seriously impaired or strongly modified. Indeed, extensive loss of functional motifs was observed in some
Buchnera proteins. In many Buchnera proteins mutations were not detected evenly throughout each molecule but tended to accumulate in some functional units, possibly
leading to loss of specific functions. As Buchnera has an unusual and limited gene repertory, it is conceivable that the manner of interactions among its proteins has been
changed, and thus, functional constraints over their amino acid residues have also been changed during evolution. This may
account for the loss of some functional units only in the Buchnera proteins. We obtained evidence that amino acid replacements in Buchnera were not always deleterious, but neutral or, in some cases, even positively selected.
Received: 14 December 2000 / Accepted: 12 March 2001
14.
Wolbachia are obligatory intracellular and maternally inherited bacteria, known to infect many species of arthropod. In this study,
we discovered a bacteriophage-like genetic element in Wolbachia, which was tentatively named bacteriophage WO. The phylogenetic tree based on phage WO genes of several Wolbachia strains was not congruent with that based on chromosomal genes of the same strains, suggesting that phage WO was active and
horizontally transmitted among various Wolbachia strains. All the strains of Wolbachia used in this study were infected with phage WO. Although the phage genome contained genes of diverse origins, the average
G+C content and codon usage of these genes were quite similar to those of a chromosomal gene of Wolbachia. These results raised the possibility that phage WO has been associated with Wolbachia for a very long time, conferring some benefit to its hosts. The evolution and possible roles of phage WO in various reproductive
alterations of insects caused by Wolbachia are discussed.
Received: 28 January 2000 / Accepted: 3 August 2000
15.
de Miranda AB Alvarez-Valin F Jabbari K Degrave WM Bernardi G 《Journal of molecular evolution》2000,50(1):45-55
Mycobacterium tuberculosis and Mycobacterium leprae are the etiological agents of tuberculosis and leprosy, respectively. After performing extensive comparisons between genes
from these two GC-rich bacterial species, we were able to construct a set of 275 homologous genes. Since these two bacterial
species also grow very slowly, translational selection may be less decisive for their codon preferences than
it is in fast-growing bacteria. Indeed, principal-components analysis of codon usage from this set of homologous genes
revealed that the codon choices in M. tuberculosis and M. leprae are correlated not only with compositional constraints and translational selection, but also with the degree of amino acid
conservation and the hydrophobicity of the encoded proteins. Finally, significant correlations were found between GC3 and synonymous distances as well as between synonymous and nonsynonymous distances.
Received: 30 October 1998 / Accepted: 16 August 1999
16.
Telomeres of most insects are composed of simple (TTAGG)n repeats that are synthesized by telomerase. However, in some dipteran
insects such as Drosophila melanogaster, (TTAGG)n repeats or telomerase activity has not been detected. Although telomere
structure is well documented in Diptera and Lepidoptera,
very limited information is available on lower insect groups. To understand general aspects of telomere function and evolution
in insects, we endeavored to characterize structures of the telomeric and subtelomeric regions in a lower insect, the Taiwan
cricket, Teleogryllus taiwanemma. FISH analysis of this insect's chromosomes demonstrated (TTAGG)n repeat elements in all
distal ends. Just proximal to the telomeric repeats, the highly conserved 9-kb long terminal unit
(LTU) sequences are tandemly repeated. These were observed in four of six chromosomes, three autosomal ends, and one X-chromosomal
end. LTU sequences represent about 0.2% of the T. taiwanemma genome. Each LTU contains a core (TTAGG)8-like sequence (TRLS) and five types of conserved sequences—ST (short telomere associated), J (joint), X, SR (satellite sequence
rich), and Y—which vary in length from about 150 bp to 2.7 kb. The LTU sequence is defined as ST–J–TRLS–SR–X–Y–X–Y–X. Most
LTU regions may be derived from the ancestral common sequence, which is observed in ST regions six times and at many other
LTU sites. We could not find the LTU-like sequence in three other crickets including the closest species, T. emma, suggesting that the LTU in T. taiwanemma has been rapidly amplified in subtelomeric regions through recent evolutionary events. It is also suggested that the highly
conserved structure of the LTU is maintained by recombination and may contribute to telomere elongation, as seen in dipteran
insects.
Received: 6 August 2001 / Accepted: 10 October 2001
17.
Jenkins GM Pagel M Gould EA de A Zanotto PM Holmes EC 《Journal of molecular evolution》2001,52(4):383-390
The extent to which base composition and codon usage vary among RNA viruses, and the possible causes of this bias, is undetermined
in most cases. A maximum-likelihood statistical method was used to test whether base composition and codon usage bias covary
with arthropod association in the genus Flavivirus, a major source of disease in humans and animals. Flaviviruses are transmitted by mosquitoes, by ticks, or directly between
vertebrate hosts. Those viruses associated with ticks were found to have a significantly lower G+C content than non-vector-borne
flaviviruses and this difference was present throughout the genome at all amino acids and codon positions. In contrast, mosquito-borne
viruses had an intermediate G+C content which was not significantly different from those of the other two groups. In addition,
biases in dinucleotide and codon usage that were independent of base composition were detected in all flaviviruses, but these
did not covary with arthropod association. However, the overall effect of these biases was slight, suggesting only weak selection
at synonymous sites. A preliminary analysis of base composition, codon usage, and vector specificity in other RNA virus families
also revealed a possible association between base composition and vector specificity, although with biases different from
those seen in the Flavivirus genus.
Received: 29 August 2000 / Accepted: 19 December 2000
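The comparison of G+C content "at all codon positions" mentioned in this abstract rests on a simple per-position tally, which could be sketched as below. This is only an illustration of that tally, not the maximum-likelihood method the authors used; the function name `gc_by_codon_position` is hypothetical and the input is assumed to be an in-frame coding sequence without ambiguity codes.

```python
def gc_by_codon_position(cds):
    """Return the G+C fraction at codon positions 1, 2, and 3 as a tuple."""
    if not cds or len(cds) % 3 != 0:
        raise ValueError("expected a non-empty in-frame coding sequence")
    seq = cds.upper()
    fractions = []
    for offset in range(3):
        bases = seq[offset::3]  # every third base, starting at this position
        gc = sum(1 for base in bases if base in "GC")
        fractions.append(gc / len(bases))
    return tuple(fractions)


if __name__ == "__main__":
    gc1, gc2, gc3 = gc_by_codon_position("ATGGCCAAG")
    print(f"GC1={gc1:.2f} GC2={gc2:.2f} GC3={gc3:.2f}")
```

The third value is the GC3 statistic that several abstracts in this list refer to.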
18.
Amino acid residues arginine (R) and lysine (K) have similar physicochemical characteristics and are often mutually substituted
during evolution without affecting protein function. Statistical examinations on human proteins show that more R than K residues
are used in the proximity of R residues, whereas more K than R are used near K residues. This biased use occurs on both a
global and a local scale (shorter than ∼100 residues). Even within a given exon, G + C-rich and A + T-rich short DNA segments
preferentially encode R and K, respectively. The biased use of R and K on a local scale is also seen in Saccharomyces cerevisiae and Caenorhabditis elegans, which lack global-scale mosaic structures with varying GC%, or isochores. Besides R and K, several amino acids are also used
with a positive or negative correlation with the local GC% of third codon bases. The local-, or "within-gene"-, scale heterogeneity
of the DNA sequence may influence the sequence of the encoded protein segment.
Received: 2 March 1998 / Accepted: 23 April 1998
19.
Single-amino-acid tandem repeats are very common in mammalian proteins but their function and evolution are still poorly understood. Here we investigate how the variability and prevalence of amino acid repeats are related to the evolutionary constraints operating on the proteins. We find a significant positive correlation between repeat size difference and protein nonsynonymous substitution rate in human and mouse orthologous genes. This association is observed for all the common amino acid repeat types and indicates that rapid diversification of repeat structures, involving both trinucleotide slippage and nucleotide substitutions, preferentially occurs in proteins subject to low selective constraints. However, strikingly, we also observe a significant negative correlation between the number of repeats in a protein and the gene nonsynonymous substitution rate, particularly for glutamine, glycine, and alanine repeats. This implies that proteins subject to strong selective constraints tend to contain an unexpectedly high number of repeats, which tend to be well conserved between the two species. This is consistent with a role for selection in the maintenance of a significant number of repeats. Analysis of the codon structure of the sequences encoding the repeats shows that codon purity is associated with high repeat size interspecific variability. Interestingly, polyalanine and polyglutamine repeats associated with disease show very distinctive features regarding the degree of repeat conservation and the protein sequence selective constraints.
20.
Retrovirus-like sequences and their solitary (solo) long terminal repeats (LTRs) are common repetitive elements in eukaryotic
genomes. We reported previously that the tandemly arrayed genes encoding U2 snRNA (the RNU2 locus) in humans and apes contain a solo LTR (U2-LTR) which was presumably generated by homologous recombination between
the two LTRs of an ancestral provirus that is retained in the orthologous baboon RNU2 locus. We have now sequenced the orthologous U2-LTRs in human, chimpanzee, gorilla, orangutan, and baboon and examined numerous
homologs of the U2-LTR that are dispersed throughout the human genome. Although these U2-LTR homologs have been collectively
referred to as LTR13 in the literature, they do not display sequence similarity to any known retroviral LTRs; however, the
structure of LTR13 closely resembles that of other retroviral LTRs with a putative promoter, polyadenylation signal, and a
tandemly repeated 53-bp enhancer-like element. Genomic blotting indicates that LTR13 is primate-specific; based on sequence
analysis, we estimate there are about 2,500 LTR13 elements in the human genome. Comparison of the primate U2-LTR sequences
suggests that the homologous recombination event that gave rise to the solo U2-LTR occurred soon after insertion of the ancestral
provirus into the ancestral U2 tandem array. Phylogenetic analysis of the LTR13 family confirms that it is diverse, but the
orthologous U2-LTRs form a coherent group in which chimpanzee is closest to human; orangutan is a clear outgroup of human,
chimpanzee, and gorilla; and baboon is a distant relative of human, chimpanzee, gorilla, and orangutan. We compare the LTR13
family with other known LTRs and consider whether these LTRs might play a role in concerted evolution of the primate RNU2 locus.
Received: 29 September 1997 / Accepted: 16 January 1998