Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
We isolated RNAs by selection–amplification, selecting for affinity to Phe–Sepharose and elution with free L-phenylalanine. Constant sequences did not contain Phe codons or anticodons, to avoid any possible confounding influence on initially randomized sequences. We examined the eight most frequent Phe-binding RNAs for inclusion of coding triplets. Binding sites were defined by nucleotide conservation, protection, and interference data. Together these RNAs comprise 70% of the 105 sequenced RNAs. The K_D for the strongest sites is ≈50 μM free amino acid, with strong stereoselectivity. One site strongly distinguishes free Phe from Trp and Tyr, a specificity not observed previously. In these eight Phe-binding RNAs, Phe codons are not significantly associated with Phe binding sites. However, among 21 characterized RNAs binding Phe, Tyr, Arg, and Ile, containing 1342 total nucleotides, codons are 2.7-fold more frequent within binding sites than in surrounding sequences in the same molecules. If triplets were not specifically related to binding sites, the probability of this distribution would be 4.8 × 10^−11. Therefore, triplet concentration within amino acid binding sites taken together is highly likely. In binding sites for Arg, Tyr, and Ile cognate codons are overrepresented. Thus Arg, Tyr, and Ile may be amino acids whose codons were assigned during an era of direct RNA–amino acid affinity. In contrast, Phe codons arguably were assigned by another criterion, perhaps during later code evolution.

2.
Biased codon usage is common in eukaryotic and prokaryotic genes. Evidence from Escherichia, Saccharomyces, and Drosophila indicates that it favors translational efficiency and accuracy. However, to date no functional advantages have been identified in the codon–anticodon interactions involving the most frequently used (preferred) codons. Here we present evidence that forces not related to the individual codon–anticodon interaction may be involved in determining which synonymous codons are preferred or avoided. We show that the "off-frame" trinucleotide motif preferences inferrable from Drosophila coding regions are often in the same direction as Drosophila's "in-frame" codon preferences, i.e., its codon usage. The off-frame preferences were inferred from the nonrandomness of the location of confamilial synonymous codons along coding regions, a pattern often described as a context dependence of nucleotide choice at synonymous positions or as codon-pair bias. We relied on randomizations of the location of confamilial codons that do not alter, and cannot be influenced by, the encoded amino acid sequences, codon usage, or base composition of the genes examined. The statistically significant congruency of in-frame and off-frame trinucleotide preferences suggests that the same kind of reading-frame-independent force(s) may also influence synonymous codon choice. These forces may have produced biases in codon usage that then led to the evolution of the translational advantages of these motifs as preferred codons. Under this scenario, tRNA pool size differences between preferred and nonpreferred codons initially were evolved to track the default overrepresentation of codons with preferred motifs. The motif preference hypothesis can explain the structuring of codon preferences and the similarities in the codon usages of distantly related organisms. Received: 10 November 1998 / Accepted: 23 February 1999

3.
Amino acid residues arginine (R) and lysine (K) have similar physicochemical characteristics and are often mutually substituted during evolution without affecting protein function. Statistical examinations on human proteins show that more R than K residues are used in the proximity of R residues, whereas more K than R are used near K residues. This biased use occurs on both a global and a local scale (shorter than ∼100 residues). Even within a given exon, G + C-rich and A + T-rich short DNA segments preferentially encode R and K, respectively. The biased use of R and K on a local scale is also seen in Saccharomyces cerevisiae and Caenorhabditis elegans, which lack global-scale mosaic structures with varying GC%, or isochores. Besides R and K, several amino acids are also used with a positive or negative correlation with the local GC% of third codon bases. The local, or "within-gene", scale heterogeneity of the DNA sequence may influence the sequence of the encoded protein segment. Received: 2 March 1998 / Accepted: 23 April 1998

4.
We characterized a full-length gene encoding wild silkmoth Antheraea pernyi fibroin (Ap-fibroin) to clarify the conformation of repetitive sequences. The gene consisted of a first exon encoding 14 amino acid residues, a short intron (120 bp), and a long second exon encoding 2,625 amino acid residues. Three amino acids, alanine, glycine, and serine, amounted to 81% of the Ap-fibroin sequence. The Ap-fibroin, except for 155 residues of the amino terminus, was composed of 80 tandemly arranged polyalanine-containing units (motifs). A motif was a doublet of a polyalanine block (PAB) and a nonpolyalanine block (NPAB). Seventy-eight of the 80 motifs were classified into four types based on differences in the NPAB sequences. Although respective motifs were significantly conserved, many rearrangements were observed within the second exon, i.e., the triplication of a 558-bp-long sequence and other duplication events of shorter sequences. Chi-like sequences, GCTGGAG, might contribute to the rearrangement within the gene as described in human minisatellite loci, because they were found at specific sites of NPAB-encoding sequences in three of four types of motifs. The present results support the idea that the Ap-fibroin gene is unstable like minisatellite sequences and that the evolution of this gene is strongly associated with its instability. Received: 18 February 2000 / Accepted: 30 June 2000

5.
Endosymbiotic bacteria live in animal cells and are transmitted vertically at the time of the host's reproduction. In view of their small and asexual populations with infrequent chances of recombination, these endocellular bacteria are expected to accumulate mildly deleterious mutations. Previous studies showed that the DNA sequences of these bacteria evolved faster than those of free-living bacteria. In this study, we compared all the ORFs of Buchnera, an endocellular bacterial symbiont of aphids, with those of 34 other prokaryotic organisms and estimated the effect of the accelerated evolution of Buchnera on the functions of its proteins. It was revealed that Buchnera proteins contain many mutations at the sites where sequences are conserved in their orthologues in many other organisms. In addition, amino acid replacements at the conserved sites are mostly changes to physicochemically different amino acids. These results suggest that functions and conformations of Buchnera proteins have been seriously impaired or strongly modified. Indeed, extensive loss of functional motifs was observed in some Buchnera proteins. In many Buchnera proteins mutations were not detected evenly throughout each molecule but tended to accumulate in some functional units, possibly leading to loss of specific functions. As Buchnera has an unusual and limited gene repertory, it is conceivable that the manner of interactions among its proteins has been changed, and thus, functional constraints over their amino acid residues have also been changed during evolution. This may account for the loss of some functional units only in the Buchnera proteins. We obtained evidence that amino acid replacements in Buchnera were not always deleterious, but neutral or, in some cases, even positively selected. Received: 14 December 2000 / Accepted: 12 March 2001

6.
Evolution of the triplet code is reconstructed on the basis of the consensus temporal order of appearance of amino acids. Several important predictions are confirmed by computational sequence analyses. The earliest amino acids, alanine and glycine, have been encoded by GCC and GGC codons, as today. They were succeeded, respectively, by A- and G-series of amino acids, encoded by pyrimidine-central and purine-central codons. The length of the earliest proteins is estimated to be 6–7 residues. The earliest mRNAs were short G+C-rich molecules. These short sequences could have formed hairpins. This is confirmed by analysis of modern prokaryotic mRNA sequences. The predominant size of the detected ancient hairpins also corresponds to 6–7 amino acids, as above. Vestiges of the last common ancestor can be found in extant proteins in the form of entirely conserved short sequences of six to nine residues present in all or almost all sequenced prokaryotic proteomes (omnipresent motifs). The functions of the topmost conserved octamers are not involved in the basic elementary syntheses. This suggests an initial abiotic supply of amino acids, bases and sugars. Presented at: National Workshop on Astrobiology: Search for Life in the Solar System, Capri, Italy, 26 to 28 October, 2005.

7.
Polyglutamine repeats within proteins are common in eukaryotes and are associated with neurological diseases in humans. Many are encoded by tandem repeats of the codon CAG that are likely to mutate primarily by replication slippage. However, a recent study in the yeast Saccharomyces cerevisiae has indicated that many others are encoded by mixtures of CAG and CAA which are less likely to undergo slippage. Here we attempt to estimate the proportions of polyglutamine repeats encoded by slippage-prone structures in species currently the subject of genome sequencing projects. We find a general excess over random expectation of polyglutamine repeats encoded by tandem repeats of codons. We nevertheless find many repeats encoded by nontandem codon structures. Mammals and Drosophila display extreme opposite patterns. Drosophila contains many proteins with polyglutamine tracts but these are generally encoded by interrupted structures. These structures may have been selected to be resistant to slippage. In contrast, mammals (humans and mice) have a high proportion of proteins in which repeats are encoded by tandem codon structures. In humans, these include most of the triplet expansion disease genes. Received: 17 August 2000 / Accepted: 20 November 2000
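
The distinction drawn in this abstract between tandem and interrupted codon structures can be illustrated with a short Python sketch. It is not the authors' method: the function name `glutamine_runs`, the minimum run length of five codons, and the toy sequence in the usage note are all assumptions made for illustration.

```python
def glutamine_runs(cds, min_len=5):
    """Return (codons, kind) for each in-frame run of glutamine codons.

    A run is a stretch of at least `min_len` consecutive CAG/CAA codons.
    `kind` is "tandem" when a single codon is repeated throughout and
    "interrupted" when CAG and CAA are mixed. The threshold `min_len`
    is an illustrative assumption, not a value taken from the abstract.
    """
    seq = cds.upper()
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    runs = []
    current = []
    for codon in codons + [None]:  # the None sentinel flushes the last run
        if codon in ("CAG", "CAA"):
            current.append(codon)
        else:
            if len(current) >= min_len:
                kind = "tandem" if len(set(current)) == 1 else "interrupted"
                runs.append((current, kind))
            current = []
    return runs
```

For a made-up coding sequence such as `"ATG" + "CAG" * 6 + "GGT" + "CAGCAACAGCAGCAACAG" + "TAA"`, the first run would be reported as tandem and the second as interrupted.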

8.
9.
The Drosophila fat body protein 2 gene (Fbp2) is an ancient duplication of the alcohol dehydrogenase gene (Adh) which encodes a protein that differs substantially from ADH in its methionine content. In D. melanogaster, there is one methionine in ADH, while there are 51 (20% of all amino acids) in FBP2. Methionine is involved in 46% of amino acid replacements when Fbp2 DNA sequences are compared between D. melanogaster and D. pseudoobscura. Methionine accumulation does not affect conserved residues of the ADH-ADHr-FBP2 multigene family. The multigene family has evolved by replacement of mildly hydrophobic amino acids by methionine with no apparent reversion. Its short-term evolution was compared between two Drosophila species, while its long-term evolution was compared between two genera belonging respectively to acalyptrate and calyptrate Diptera, Drosophila and Sarcophaga. The pattern of nucleotide substitution was consistent with an independent accumulation of methionines at the Fbp2 locus in each lineage. Under a steady-state model, the rate of methionine accumulation was constant in the lineage leading to Drosophila, and was twice as fast as that in the calyptrate lineage. Substitution rates were consistent with a slight positive selective advantage for each methionine change in about one-half of amino acid sites in Drosophila. This shows that selection can potentially account for a large proportion of amino acid replacements in the molecular evolution of proteins. Received: 12 December 1994 / Accepted: 15 April 1996

10.
Numerous RNA binding sites for specific amino acids are now known, coming predominantly from selection-amplification experiments. These sites are chemically discriminating despite being predominantly small, simple RNA structures: internal and bulge loops. Recent studies of sites for hydrophobic side chains suggest that there are other generalizable structural features which recur in hydrophobic RNA sites. Further, sites for hydrophobic side chains can contain codons for the bound amino acid, as has also long been known for the polar amino acid arginine. Such findings are comprehensively reviewed, and the implications for the origin of coded peptide synthesis are considered. An origins hypothesis which accommodates all the data, DRT (direct RNA templating), is formulated. Received: 22 December 1997 / Accepted: 13 February 1998

11.
Porin of Haemophilus influenzae type b (341 amino acids; M_r 37,782) determines the permeability of the outer membrane to low molecular mass compounds. Purified Hib porin was subjected to chemical modification of lysine residues by succinic anhydride. Electrospray ionization mass spectrometry identified up to 12 modifications per porin molecule. Tryptic digestion of modified Hib porin followed by reverse phase chromatography and matrix assisted laser desorption ionization time-of-flight mass spectrometry mapped the succinylation sites. Most modified lysines are positioned in surface-located loops, numbers 1 and 4 to 7. Succinylated porin was reconstituted into planar lipid bilayers, and biophysical properties were analyzed and compared to Hib porin: there was an increased average single channel conductance compared to Hib porin (1.24 ± 0.41 vs. 0.85 ± 0.40 nanosiemens). The voltage-gating activity of succinylated porin differed considerably from that of Hib porin. The threshold voltage for gating was decreased from 75 to 40 mV. At 80 mV, steady-state conductance for succinylated porin was 50–55% of the instantaneous conductance. Hib porin at 80 mV showed a decrease to 89–91% of the instantaneous current levels. We propose that surface-located lysine residues are determinants of voltage gating for porin of Haemophilus influenzae type b. Received: 11 August 2000 / Revised: 8 September 2000

12.
We show that in animal mitochondria homologous genes that differ in guanine plus cytosine (G + C) content code for proteins differing in amino acid content in a manner that relates to the G + C content of the codons. DNA sequences were analyzed using square plots, a new method that combines graphical visualization and statistical analysis of compositional differences in both DNA and protein. Square plots divide codons into four groups based on first and second position A + T (adenine plus thymine) and G + C content and indicate differences in amino acid content when comparing sequences that differ in G + C content. When sequences are compared using these plots, the amino acid content is shown to correlate with the nucleotide bias of the genes. This amino acid effect is shown in all protein-coding genes in the mitochondrial genome, including cox I, cox II, and cyt b, mitochondrial genes which are commonly used for phylogenetic studies. Furthermore, nucleotide content differences are shown to affect the content of all amino acids with A + T- and G + C-rich codons. We speculate that phylogenetic analysis of genes so affected may tend erroneously to indicate relatedness (or lack thereof) based only on amino acid content. Received: 3 July 1996 / Accepted: 6 November 1996
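
One possible reading of the four-way codon grouping described in this abstract is sketched below in Python. The abstract does not spell out the exact grouping, so the two-letter labels (W for A or T, S for G or C, at the first and second codon positions) and the function name `codon_groups` are assumptions; the square plot graphics and statistics themselves are not reproduced.

```python
from collections import Counter


def codon_groups(cds):
    """Count codons by the A+T / G+C class of their first two positions.

    Each codon is labeled with two letters, one per position:
    "W" for A or T, "S" for G or C. This gives four groups
    (WW, WS, SW, SS). The labeling is an assumed interpretation of
    the grouping, and the input is assumed to contain only A, C, G, T.
    """
    seq = cds.upper()
    counts = Counter()
    # Walk over complete codons only, ignoring a trailing partial codon.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        first, second = seq[i], seq[i + 1]
        label = ("S" if first in "GC" else "W") + ("S" if second in "GC" else "W")
        counts[label] += 1
    return counts
```

Comparing the group counts of two homologous genes that differ in G + C content would be a first step toward the kind of comparison the abstract describes.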

13.
In many unicellular organisms, invertebrates, and plants, synonymous codon usage biases result from a coadaptation between codon usage and tRNA abundance to optimize the efficiency of protein synthesis. However, it remains unclear whether natural selection acts at the level of the speed or the accuracy of mRNA translation. Here we show that codon usage can improve the fidelity of protein synthesis in multicellular species. As predicted by the model of selection for translational accuracy, we find that the frequency of codons optimal for translation is significantly higher at codons encoding conserved amino acids than at codons encoding nonconserved amino acids in 548 genes compared between Caenorhabditis elegans and Homo sapiens. Although this model predicts that codon bias correlates positively with gene length, a negative correlation between codon bias and gene length has been observed in eukaryotes. This suggests that selection for fidelity of protein synthesis is not the main factor responsible for codon biases. The relationship between codon bias and gene length remains unexplained. Exploring the differences in the gene expression process between eukaryotes and prokaryotes should provide new insights into this key question of codon usage. Received: 18 June 2000 / Accepted: 10 November 2000

14.
We surveyed the molecular evolutionary characteristics of 25 plant gene families, with the goal of better understanding general processes in plant gene family evolution. The survey was based on 247 GenBank sequences representing four grass species (maize, rice, wheat, and barley). For each gene family, orthology and paralogy relationships were uncertain. Recognizing this uncertainty, we characterized the molecular evolution of each gene family in four ways. First, we calculated the ratio of nonsynonymous to synonymous substitutions (d_N/d_S) both on branches of gene phylogenies and across codons. Our results indicated that the d_N/d_S ratio was statistically heterogeneous across branches in 17 of 25 (68%) gene families. The vast majority of d_N/d_S estimates were ≪1.0, suggestive of selective constraint on amino acid replacements, and no estimates were >1.0, either across phylogenetic lineages or across codons. Second, we tested separately for nonsynonymous and synonymous molecular clocks. Sixty-eight percent of gene families rejected a nonsynonymous molecular clock, and 52% of gene families rejected a synonymous molecular clock. Thus, most gene families in this study deviated from clock-like evolution at either synonymous or nonsynonymous sites. Third, we calculated the effective number of codons and the proportion of G+C synonymous sites for each sequence in each gene family. One or both quantities vary significantly within 18 of 25 gene families. Finally, we tested for gene conversion, and only six gene families provided evidence of gene conversion events. Altogether, evolution for these 25 gene families is marked by selective constraint that varies among gene family members, a lack of molecular clock at both synonymous and nonsynonymous sites, and substantial variation in codon usage. Received: 25 May 2000 / Accepted: 16 October 2000
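
As a rough illustration of a codon-usage statistic related to those mentioned in this abstract, the Python sketch below computes the G+C fraction at third codon positions. This is a simplification and an assumption on our part: it counts every third position, whereas a measure restricted to synonymous sites would exclude codons without synonymous alternatives. The function name `gc3` is hypothetical.

```python
def gc3(cds):
    """Return the fraction of third codon positions that are G or C.

    Simplified sketch: all complete codons are counted, including
    codons for Met and Trp, so this is not restricted to synonymous
    sites as the statistic named in the abstract appears to be.
    """
    seq = cds.upper()
    # Third base of every complete codon; a trailing partial codon is ignored.
    thirds = [seq[i + 2] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not thirds:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in thirds) / len(thirds)
```

For example, `gc3("ATGGCCAAATTT")` looks at the third bases G, C, A, and T and returns 0.5.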

15.
Highly expressed plastid genes display codon adaptation, which is defined as a bias toward a set of codons which are complementary to abundant tRNAs. This type of adaptation is similar to what is observed in highly expressed Escherichia coli genes and is probably the result of selection to increase translation efficiency. In the current work, the codon adaptation of plastid genes is studied with regard to three specific features that have been observed in E. coli and which may influence translation efficiency. These features are (1) a relatively low codon adaptation at the 5′ end of highly expressed genes, (2) an influence of neighboring codons on codon usage at a particular site (codon context), and (3) a correlation between the level of codon adaptation of a gene and its amino acid content. All three features are found in plastid genes. First, highly expressed plastid genes have a noticeable decrease in codon adaptation over the first 10–20 codons. Second, for the twofold degenerate NNY codon groups, highly expressed genes have an overall bias toward the NNC codon, but this is not observed when the 3′ neighboring base is a G. At these sites highly expressed genes are biased toward NNT instead of NNC. Third, plastid genes that have higher codon adaptations also tend to have an increased usage of amino acids with a high G + C content at the first two codon positions and GNN codons in particular. The correlation between codon adaptation and amino acid content exists separately for both cytosolic and membrane proteins and is not related to any obvious functional property. It is suggested that at certain sites selection discriminates between nonsynonymous codons based on translational, not functional, differences, with the result that the amino acid sequence of highly expressed proteins is partially influenced by selection for increased translation efficiency. Received: 21 July 1999 / Accepted: 5 November 1999

16.
We simulate a deterministic population genetic model for the coevolution of genetic codes and protein-coding genes. We use very simple assumptions about translation, mutation, and protein fitness to calculate mutation-selection equilibria of codon frequencies and fitness in a large asexual population with a given genetic code. We then compute the fitnesses of altered genetic codes that compete to invade the population by translating its genes with higher fitness. Codes and genes coevolve in a succession of stages, alternating between genetic equilibration and code invasion, from an initial wholly ambiguous coding state to a diversified frozen coding state. Our simulations almost always resulted in partially redundant frozen genetic codes. Also, the range of simulated physicochemical properties among encoded amino acids in frozen codes was always less than maximal. These results did not require the assumption of historical constraints on the number and type of amino acids available to codes nor on the complexity of proteins, stereochemical constraints on the translational apparatus, nor mechanistic constraints on genetic code change. Both the extent and timing of amino-acid diversification in genetic codes were strongly affected by the message mutation rate and strength of missense selection. Our results suggest that various omnipresent phenomena that distribute codons over sites with different selective requirements (such as the persistence of nonsynonymous mutations at equilibrium, the positive selection of the same codon in different types of sites, and translational ambiguity) predispose the evolution of redundancy and of reduced amino acid diversity in genetic codes. Received: 21 December 2000 / Accepted: 12 March 2001

17.
The members of the PKA regulatory subunit family (PKA-R family) were analyzed by multiple sequence alignment and clustering based on phylogenetic tree construction. According to the phylogenetic trees generated from multiple sequence alignment of the complete sequences, the PKA-R family was divided into four subfamilies (types I to IV). Members of each subfamily were exclusively from animals (types I and II), fungi (type III), and alveolates (type IV). Application of the same methodology to the cAMP-binding domains, and subsequently to the region delimited by β-strands 6 and 7 of the crystal structures of bovine RIα and rat RIIβ (the phosphate-binding cassette; PBC), proved that this highly conserved region was enough to classify unequivocally the members of the PKA-R family. A single signature sequence, F–G–E–[LIV]–A–L–[LIMV]–x(3)–[PV]–R–[ANQV]–A, corresponding to the PBC was identified which is characteristic of the PKA-R family and is sufficient to distinguish it from other members of the cyclic nucleotide-binding protein superfamily. Specific determinants for the A and B domains of each R-subunit type were also identified. Conserved residues defining the signature motif are important for interaction with cAMP or for positioning the residues that directly interact with cAMP. Conversely, residues that define subfamilies or domain types are not conserved and are mostly located on the loop that connects α-helix B′ and β strand 7. Received: 2 November 2000 / Accepted: 14 June 2001
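
The signature sequence quoted in this abstract uses a PROSITE-style notation, which can be translated into a regular expression for scanning protein sequences. The Python sketch below is a minimal converter covering only the elements that appear in this signature (single residues, bracketed residue classes, and `x(n)` wildcards). The helper name `prosite_to_regex` is hypothetical, and the pattern is written with ASCII hyphens in place of the dashes used in the abstract.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple PROSITE-style pattern into a regex string.

    Supports only: single residues (F), residue classes ([LIV]),
    and wildcards x or x(n). Other PROSITE syntax is not handled.
    """
    parts = []
    for element in pattern.split("-"):
        wildcard = re.fullmatch(r"x(?:\((\d+)\))?", element)
        if wildcard:
            count = wildcard.group(1)
            parts.append("." if count is None else ".{%s}" % count)
        else:
            # A literal residue or a bracketed class is already valid regex.
            parts.append(element)
    return "".join(parts)


# Signature from the abstract, written with ASCII hyphens.
SIGNATURE = "F-G-E-[LIV]-A-L-[LIMV]-x(3)-[PV]-R-[ANQV]-A"
PBC_RE = re.compile(prosite_to_regex(SIGNATURE))
```

Calling `PBC_RE.search(sequence)` on a protein sequence in one-letter code returns a match object when the signature is present and `None` otherwise. A made-up peptide such as `MKFGELALIYGTPRAATV` (not a real protein) contains a match starting at index 2.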

18.
Molecular Evolution of the Myeloperoxidase Family   Cited 4 times in total (self-citations: 0, citations by others: 4)
Animal myeloperoxidase and its relatives constitute a diverse protein family, which includes myeloperoxidase, eosinophil peroxidase, thyroid peroxidase, salivary peroxidase, lactoperoxidase, ovoperoxidase, peroxidasin, peroxinectin, cyclooxygenase, and others. The members of this protein family share a catalytic domain of about 500 amino acid residues in length, although some members have distinctive mosaic structures. To investigate the evolution of the protein family, we performed a comparative analysis of its members, using the amino acid sequences and the coordinate data available today. The results obtained in this study are as follows: (1) 60 amino acid sequences belonging to this family were collected by database searching. We found a new member of the myeloperoxidase family derived from a bacterium. This is the first report of a bacterial member of this family. (2) An unrooted phylogenetic tree of the family was constructed according to the alignment. Considering the branching pattern in the obtained phylogenetic tree, together with the mosaic features in the primary structures, 60 members of the myeloperoxidase family were classified into 16 subfamilies. (3) We found two molecular features that distinguish cyclooxygenase from the other members of the protein family. (4) Several structurally deviated segments were identified by a structural comparison between cyclooxygenase and myeloperoxidase. Some of the segments seemed to be associated with the functional and/or structural differences between the enzymes. Received: 25 January 2000 / Accepted: 19 July 2000

19.
Natural selection favors certain synonymous codons which aid translation in Escherichia coli, yet codons not favored by translational selection persist. We use the frequency distributions of synonymous polymorphisms to test three hypotheses for the existence of translationally sub-optimal codons: (1) selection is a relatively weak force, so there is a balance between mutation, selection, and drift; (2) at some sites there is no selection on codon usage, so some synonymous sites are unaffected by translational selection; and (3) translationally sub-optimal codons are favored by alternative selection pressures at certain synonymous sites. We find that when all the data is considered, model 1 is supported and both models 2 and 3 are rejected as sole explanations for the existence of translationally sub-optimal codons. However, we find evidence in favor of both models 2 and 3 when the data is partitioned between groups of amino acids and between regions of the genes. Thus, all three mechanisms appear to contribute to the existence of translationally sub-optimal codons in E. coli. Received: 18 July 2000 / Accepted: 17 April 2001

20.
The phylogenetic placement of the Aquifex and Thermotoga lineages has been inferred from (i) the concatenated ribosomal proteins S10, L3, L4, L23, L2, S19, L22, and S3 encoded in the S10 operon (833 aa positions); (ii) the joint sequences of the elongation factors Tu(1α) and G(2) coded by the str operon tuf and fus genes (733 aa positions); and (iii) the joint RNA polymerase β- and β′-type subunits encoded in the rpoBC operon (1130 aa positions). Phylogenies of r-protein and EF sequences support with moderate (r-proteins) to high statistical confidence (EFs) the placement of the two hyperthermophiles at the base of the bacterial clade in agreement with phylogenies of rRNA sequences. In the more robust EF-based phylogenies, the branching of Aquifex and Thermotoga below the successive bacterial lineages is given at bootstrap proportions of 82% (maximum likelihood; ML) and 85% (maximum parsimony; MP), in contrast to the trees inferred from the separate EF-Tu(1α) and EF-G(2) data sets, which lack both resolution and statistical robustness. In the EF analysis MP outperforms ML in discriminating (at the 0.05 level) trees having A. pyrophilus and T. maritima as the most basal lineages from competing alternatives that have (i) mesophiles, or the Thermus genus, as the deepest bacterial radiation and (ii) a monophyletic A. pyrophilus–T. maritima cluster situated at the base of the bacterial clade. RNAP-based phylogenies are equivocal with respect to the Aquifex and Thermotoga placements. The two hyperthermophiles fall basal to all other bacterial phyla when potential artifacts contributed by the compositionally biased and fast-evolving Mycoplasma genitalium and Mycoplasma pneumoniae sequences are eschewed. However, the branching order of the phyla is tenuously supported in ML trees inferred by the exhaustive search method and is unresolved in ML trees inferred by the quartet puzzling algorithm. A rooting of the RNA polymerase-subunit tree at the mycoplasma level seen in both the MP trees and the ML trees reconstructed with suboptimal amino acid substitution models is not supported by the EF-based phylogenies which robustly affiliate mycoplasmas with low-G+C gram-positives and, most probably, reflects a "long branch attraction" artifact. Received: 22 September 1999 / Accepted: 11 January 2000
