Found 20 similar articles (search time: 0 ms)
1.
Nucleotide Composition Bias Affects Amino Acid Content in Proteins Coded by Animal Mitochondria (total citations: 16; self-citations: 0; citations by others: 16)
We show that in animal mitochondria homologous genes that differ in guanine plus cytosine (G + C) content code for proteins
differing in amino acid content in a manner that relates to the G + C content of the codons. DNA sequences were analyzed using
square plots, a new method that combines graphical visualization and statistical analysis of compositional differences in
both DNA and protein. Square plots divide codons into four groups based on first and second position A + T (adenine plus thymine)
and G + C content and indicate differences in amino acid content when comparing sequences that differ in G + C content. When
sequences are compared using these plots, the amino acid content is shown to correlate with the nucleotide bias of the genes.
This amino acid effect is shown in all protein-coding genes in the mitochondrial genome, including cox I, cox II, and cyt b, mitochondrial genes which are commonly used for phylogenetic studies. Furthermore, nucleotide content differences are shown
to affect the content of all amino acids with A + T- and G + C-rich codons. We speculate that phylogenetic analysis of genes
so affected may tend erroneously to indicate relatedness (or lack thereof) based only on amino acid content.
Received: 3 July 1996 / Accepted: 6 November 1996
2.
Near Homogeneity of PR2-Bias Fingerprints in the Human Genome and Their Implications in Phylogenetic Analyses (total citations: 1; self-citations: 0; citations by others: 1)
Noboru Sueoka 《Journal of molecular evolution》2001,53(4-5):469-476
Genes of a multicellular organism are heterogeneous in the G+C content, which is particularly true in the third codon position.
The extent of deviation from intra-strand equality rule of A = T and G = C (Parity Rule 2, or PR2) is specific for individual amino acids and has been expressed as the PR2-bias fingerprint. Previous
results suggested that the PR2-bias fingerprints tend to be similar among the genes of an organism, and the fingerprint of
the organism is specific for different taxa, reflecting phylogenetic relationships of organisms. In this study, using coding
sequences of a large number of human genes, we examined the intragenomic heterogeneity of their PR2-bias fingerprints in relation
to the G+C content of the third codon position (P3). The result shows that the PR2-bias fingerprint is similar over a wide range of G+C content at the third codon position
(0.30–0.80). This range covers approximately 89% of the genes, and further analysis of the high G+C range (0.80–1.00), where
genes with normal PR2-bias fingerprints and those with anomalous fingerprints are mixed, shows that a total of 95% of genes
have similar fingerprints. The result indicates that the PR2-bias fingerprint is a unique property of an organism and
represents the overall characteristics of the genome. Combined with the previous results that the evolutionary change of the
PR2-bias fingerprint is a slow process, PR2-bias fingerprints may be used for the phylogenetic analyses to supplement and
augment the conventional methods that use the differences of the sequences of orthologous proteins and nucleic acids. Potential
advantages and disadvantages of the PR2-bias fingerprint analysis are discussed.
Received: 21 December 2000 / Accepted: 16 February 2001
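The two quantities this abstract relies on, the G+C content at the third codon position (P3) and the deviation from Parity Rule 2 (A = T and G = C within a strand), can be illustrated with a short sketch. The abstract gives no formulas, so the ratios below, A3/(A3+T3) and G3/(G3+C3), are an assumed, commonly used way to express PR2 bias. The function names are hypothetical, and the sketch works on a whole coding sequence rather than per amino acid as the fingerprint does.

```python
from collections import Counter


def third_position_counts(cds: str) -> Counter:
    """Count the bases found at the third position of each complete codon."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return Counter(seq[i + 2] for i in range(0, usable, 3))


def gc3(cds: str) -> float:
    """G+C fraction at the third codon position (P3 in the abstract)."""
    counts = third_position_counts(cds)
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no complete unambiguous codon")
    return (counts["G"] + counts["C"]) / total


def pr2_bias(cds: str) -> tuple[float, float]:
    """Return (A3/(A3+T3), G3/(G3+C3)); both equal 0.5 when PR2 holds.

    Assumption: these ratios are an illustrative PR2-bias measure,
    not a formula quoted from the abstract.
    """
    counts = third_position_counts(cds)
    at = counts["A"] + counts["T"]
    gc = counts["G"] + counts["C"]
    a_ratio = counts["A"] / at if at else float("nan")
    g_ratio = counts["G"] / gc if gc else float("nan")
    return a_ratio, g_ratio
```

For example, `pr2_bias("AAAAAGAAC")` reads the third positions A, G, and C, so the A ratio is 1.0 and the G ratio is 0.5. A real analysis would compute these ratios separately for each amino acid with four synonymous codons.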
3.
4.
Jenkins GM Pagel M Gould EA de A Zanotto PM Holmes EC 《Journal of molecular evolution》2001,52(4):383-390
The extent to which base composition and codon usage vary among RNA viruses, and the possible causes of this bias, is undetermined
in most cases. A maximum-likelihood statistical method was used to test whether base composition and codon usage bias covary
with arthropod association in the genus Flavivirus, a major source of disease in humans and animals. Flaviviruses are transmitted by mosquitoes, by ticks, or directly between
vertebrate hosts. Those viruses associated with ticks were found to have a significantly lower G+C content than non-vector-borne
flaviviruses and this difference was present throughout the genome at all amino acids and codon positions. In contrast, mosquito-borne
viruses had an intermediate G+C content which was not significantly different from those of the other two groups. In addition,
biases in dinucleotide and codon usage that were independent of base composition were detected in all flaviviruses, but these
did not covary with arthropod association. However, the overall effect of these biases was slight, suggesting only weak selection
at synonymous sites. A preliminary analysis of base composition, codon usage, and vector specificity in other RNA virus families
also revealed a possible association between base composition and vector specificity, although with biases different from
those seen in the Flavivirus genus.
Received: 29 August 2000 / Accepted: 19 December 2000
5.
A survey of the patterns of synonymous codon preference in the HIV env gene reveals a correlation between the codon bias and the mutability requirements of different regions of the protein. At
hypervariable regions in gp120 one finds a greater proportion of codons that tend to mutate nonsynonymously, but to a target
that is similar in hydrophobicity and volume. We argue that this strategy results from a compromise between the selective
pressure placed on the virus by the induced immune response, which favors amino acid substitutions in the complementarity
determining regions, and the negative selection against missense mutations that violate structural constraints of the env protein.
Received: 9 June 1997 / Accepted: 25 May 1998
6.
The primary and secondary structure of the small-subunit ribosomal RNA (ssrRNA) gene from the naked, marine amoeba, Vannella anglica (subclass Gymnamoebia), was determined. The ssrRNA is 1962 nucleotides in length, with a low G+C content of 37.1%. The ssrRNA
is composed of several uncommon secondary structure features including helix E8-1, which may be a useful target for rRNA probes
for the direct identification of isolates in mixed culture. Phylogenetic analysis of sequence data showed that V. anglica branched prior to the rapid diversification of the eukaryotes. It did not associate with the other naked, lobose amoebae
represented by Acanthamoeba and Hartmannella, indicating that Vannella represents a separate amoeboid lineage and the subclass Gymnamoebia is polyphyletic.
Received: 9 July 1998 / Accepted: 16 November 1998
7.
Seiichi Taguchi Shuichi Kojima Mahito Terabe Yoshinori Kumazawa Hiroshi Kohriyama Masayuki Suzuki Kin-ichiro Miura Haruo Momose 《Journal of molecular evolution》1997,44(5):542-551
We previously found that proteinaceous protease inhibitors homologous to Streptomyces subtilisin inhibitor (SSI) are widely produced by various Streptomyces species, and we designated them "SSI-like proteins" (Taguchi S, Kikuchi H, Suzuki M, Kojima S, Terabe M, Miura K, Nakase
T, Momose H [1993] Appl Environ Microbiol 59:4338–4341). In this study, SSI-like proteins from five strains of the genus Streptoverticillium were purified and sequenced, and molecular phylogenetic trees were constructed on the basis of the determined amino acid
sequences together with those determined previously for Streptomyces species. The phylogenetic trees showed that SSI-like proteins from Streptoverticillium species are phylogenetically included in Streptomyces SSI-like proteins but form a monophyletic group as a distinct lineage within the Streptomyces proteins. This provides an alternative phylogenetic framework to the previous one based on partial small ribosomal RNA sequences,
and it may indicate that the phylogenetic affiliation of the genus Streptoverticillium should be revised. The phylogenetic trees also suggested that SSI-like proteins possessing arginine or methionine at the
P1 site, the major reactive center site toward target proteases, arose multiple times on independent lineages from ancestral
proteins possessing lysine at the P1 site. Most of the codon changes at the P1 site inferred to have occurred during the evolution
of SSI-like proteins are consistent with those inferred from the extremely high G + C content of Streptomyces genomes. The inferred minimum number of amino acid replacements at the P1 site was nearly equal to the average number for
all the variable sites. It thus appears that positive Darwinian selection, which has been postulated to account for accelerated
rates of amino acid replacement at the major reaction center site of mammalian protease inhibitors, may not have dictated
the evolution of the bacterial SSI-like proteins.
Received: 23 August 1996 / Accepted: 20 November 1996
8.
Base composition is not uniform across the genome of Drosophila melanogaster. Earlier analyses have suggested that there is variation in composition in D. melanogaster on both a large scale and a much smaller, within-gene, scale. Here we present analyses on 117 genes which have reliable intron/exon
boundaries and no known alternative splicing. We detect significant heterogeneity in G+C content among intron segments from
the same gene, as well as a significant positive correlation between the intron and the third codon position G+C content within
genes. Both of these observations appear to be due, in part, to an overall decline in intron and third codon position G+C
content along Drosophila genes with introns. However, there is also evidence of an increase in third codon position G+C content at the start of genes;
this is particularly evident in genes without introns. This is consistent with selection acting against preferred codons at
the start of genes.
Received: 24 February 1997 / Accepted: 10 November 1997
9.
Allan M. Crawford Steven M. Kappes Korena A. Paterson Mauricio J. deGotari Ken G. Dodds Brad A. Freking Roger T. Stone Craig W. Beattie 《Journal of molecular evolution》1998,46(2):256-260
Previous studies suggest the median allele length of microsatellites is longest in the species from which the markers were
derived, suggesting that an ascertainment bias was operating. We have examined whether the size distribution of microsatellite
alleles between sheep and cattle is source dependent using a set of 472 microsatellites that can be amplified in both species.
For those markers that were polymorphic in both species we report a significantly greater number of markers (P < 0.001) with longer median allele sizes in sheep, regardless of microsatellite origin. This finding suggests that any ascertainment
bias operating during microsatellite selection is only a minor contributor to the variation observed.
Received: 6 January 1997 / Accepted: 19 May 1997
10.
Jen-Jen Lin Tzung-Horng Yang Benjamin D. Wahlstrand Peter A. Fields George N. Somero 《Journal of molecular evolution》2002,54(1):107-117
Unlike birds and mammals, teleost fish express two paralogous isoforms (paralogues) of cytosolic malate dehydrogenase (cMDH;
EC 1.1.1.37; NAD+: malate oxidoreductase) whose evolutionary relationships to the single cMDH of tetrapods are unknown. We sequenced complementary
DNAs for both cMDHs and the mitochondrial isoform (mMDH) of the fish Sphyraena idiastes (south temperate barracuda) and compared the sequences, kinetic properties, and thermal stabilities of the three isoforms
with those of mammalian orthologues. Both fish cMDHs comprise 333 residues and have subunit masses of approximately 36 kDa.
One cytosolic isoform, cMDH-S, was significantly more heat-stable than either the other cMDH (cMDH-L) or mMDH. In contradiction
to the generally accepted model of vertebrate cMDH evolution, our phylogenetic analysis indicates that the duplication of
the fish cytosolic paralogues occurred after the divergence of the lineages leading to teleosts and tetrapods. cMDH-L and
cMDH-S differed in optimal concentrations of substrates and cofactors and apparent Michaelis–Menten constants, suggesting
that the two paralogues may play distinct physiological roles. Differences in intrinsic thermal stability among MDH paralogues
may reflect different degrees of stabilization in vivo by extrinsic stabilizers, notably protein concentration in the case
of mMDH. Thermal stabilities of porcine mMDH and cMDH-L, but not cMDH-S, were significantly increased when denaturation was
measured at a high protein (bovine serum albumin; BSA) concentration, but the BSA-induced stabilization reduced the catalytic
activity.
Received: 5 April 2001 / Accepted: 28 June 2001
11.
The mitochondrial DNA-encoded cytochrome oxidase subunit I (COI) gene and the nuclear DNA-encoded hsp60 gene from the euglenoid
protozoan Euglena gracilis were cloned and sequenced. The COI sequence represents the first example of a mitochondrial genome-encoded gene from this
organism. This gene contains seven TGG tryptophan codons and no TGA tryptophan codons, suggesting the use of the universal
genetic code. This differs from the situation in the mitochondrion of the related kinetoplastid protozoa, in which TGA codes
for tryptophan. In addition, a complete absence of CGN triplets may imply the lack of the corresponding tRNA species. COI
cDNAs from E. gracilis possess short 5′ and 3′ untranslated transcribed sequences and lack a 3′ poly[A] tail.
The COI gene does not require uridine insertion/deletion RNA editing, as occurs in kinetoplastid mitochondria, to be functional,
and no short guide RNA-like molecules could be visualized by labeling total mitochondrial RNA with [α-32P]GTP and guanylyl transferase. In spite of the differences in codon usage and the 3′ end structures of mRNAs, phylogenetic
analysis using the COI and hsp60 protein sequences suggests a monophyletic relationship between the mitochondrial genomes
of E. gracilis and of the kinetoplastids, which is consistent with the phylogenetic relationship of these groups previously obtained using
nuclear ribosomal RNA sequences.
Received: 5 March 1996 / Accepted: 31 July 1996
12.
Cristina M. Justice Zhining Den Son V. Nguyen Mark Stoneking Prescott L. Deininger Mark A. Batzer Bronya J.B. Keats 《Journal of molecular evolution》2001,52(3):232-238
Friedreich ataxia is an autosomal recessive neurodegenerative disorder associated with a GAA repeat expansion in the first
intron of the gene (FRDA) encoding a novel, highly conserved, 210 amino acid protein known as frataxin. Normal variation in
repeat size was determined by analysis of more than 600 DNA samples from seven human populations. This analysis showed that
the most frequent allele had nine GAA repeats, and no alleles with fewer than five GAA repeats were found. The European and
Syrian populations had the highest percentage of alleles with 10 or more GAA repeats, while the Papua New Guinea population
did not have any alleles carrying more than 10 GAA repeats. The distributions of repeat sizes in the European, Syrian, and
African American populations were significantly different from those in the Asian and Papua New Guinea populations (p < 0.001). The GAA repeat size was also determined in five nonhuman primates. Samples from 10 chimpanzees, 3 orangutans, 1
gorilla, 1 rhesus macaque, 1 mangabey, and 1 tamarin were analyzed. Among those primates belonging to the Pongidae family,
the chimpanzees were found to carry three or four GAA repeats, the orangutans had four or five GAA repeats, and the gorilla
carried three GAA repeats. In primates belonging to the Cercopithecidae family, three GAA repeats were found in the mangabey
and two in the rhesus macaque. However, an AluY subfamily member inserted in the poly(A) tract preceding the GAA repeat region in the rhesus macaque, making the amplified
sequence approximately 300 bp longer. The GAA repeat was also found in the tamarin, suggesting that it arose at least 40 million
years ago and remained relatively small throughout the majority of primate evolution, with a punctuated expansion in the human
genome.
Received: 18 August 2000 / Accepted: 10 November 2000
13.
Tominaga O Su ZH Kim CG Okamoto M Imura Y Osawa S 《Journal of molecular evolution》2000,50(6):541-549
Phylogenetic analyses based on the mitochondrial ND5 gene comparisons and the geohistory of the Japanese Islands suggest
that each Japanese species belonging to the subtribe Carabina has its own history for the establishment of its present habitat
in the Japanese Islands. It can be roughly classified into two categories: (1) species which were derived from the ancestry
that inhabited ancient Japan at the time of its split from the Eurasian Continent [ca. 15 million years ago (MYA)], followed
by diversification within the Japanese Islands; and (2) species which invaded Hokkaido from the Eurasian Continent through
land-bridges from Sakhalin and/or the Kuriles or from western Japan from the Korean Peninsula during the glacial era (<2 MYA).
Received: 28 September 1999 / Accepted: 25 February 2000
14.
Matthew Bellgard David Schibeci Edward Trifonov Takashi Gojobori 《Journal of molecular evolution》2001,53(4-5):465-468
Identifying the G + C difference between closely related bacterial species or between different strains of the same species
is one of the first steps in understanding the evolutionary mechanisms accounting for the differences observed among bacterial
species. The G + C content can be one of the most important factors in the evolution of genomic structures. In this paper,
we describe a new method for detecting an initial stage of differentiation of the G + C content at the third codon base position
between two strains of the same bacterial species. We apply this method to the two strains of Helicobacter pylori. A group of genes is detected with large variations of G + C in the third positions—apparently genes of early response to
pressures of changing G + C. We discuss our findings from the viewpoint of genomic evolution.
Received: 26 February 2001 / Accepted: 16 May 2001
15.
McClellan DA 《Journal of molecular evolution》2000,51(3):185-193
The codon-degeneracy model (CDM) predicts relative frequencies of substitution for any set of homologous protein-coding DNA
sequences based on patterns of nucleotide degeneracy, codon composition, and the assumption of selective neutrality. However,
at present, the CDM is reliant on outside estimates of transition bias. A new method by which the power of the CDM can be
used to find a synonymous transition bias that is optimal for any given phylogenetic tree topology is presented. An example
is illustrated that utilizes optimized transition biases to generate CDM GF-scores for every possible phylogenetic tree for
pocket gophers of the genus Orthogeomys. The resulting distribution of CDM GF-scores is compared and contrasted with the results of maximum parsimony and maximum
likelihood methods. Although convergence on a single tree topology by the CDM and another method indicates greater support
for that particular tree, the value of CDM GF-score as the sole optimality criterion for phylogeny reconstruction remains
to be determined. It is clear, however, that the a priori estimation of an optimum transition bias from codon composition
has a direct application to differentiating between alternative trees.
Received: 13 October 1999 / Accepted: 28 April 2000
16.
Noboru Sueoka 《Journal of molecular evolution》1999,49(1):49-62
The relative contribution of mutation and selection to the G+C content of DNA was analyzed in bacterial species having widely
different G+C contents. The analysis used two methods that were developed previously. The first method was to plot the average
G+C content of a set of nucleotides against the G+C content of the third codon position for each gene. This method was used
to present the G+C distribution of the third codon position and to assess the relative neutrality of a set of nucleotides
to that of the G+C content of the third codon position. The second method was to plot the intrastrand bias of the third codon
position from Parity Rule 2 (PR2), where A=T and G=C. It was found that whereas intragenomic distributions of the DNA G+C content of these bacteria are narrow in the majority
of species, in some species the G+C content of the minor class of genes distributes over wider ranges than the major class
of genes. On the other hand, ubiquitous PR2 biases are amino acid specific and independent of the G+C content of DNA, so that
when averaged over the amino acids, the biases are small and not correlated with the DNA G+C content. Therefore, translation
coupled PR2-biases are unlikely to explain the wide range of G+C contents among different species. Considering all data available,
it was concluded that the amino acid-specific PR2 bias has only a minor effect, if any, on the average G+C content. In addition,
PR2 bias patterns of different species show phylogenetic relationships, and the pattern can be used as a taxal fingerprint.
Received: 5 November 1998 / Accepted: 1 March 1999
17.
An Evaluation of Measures of Synonymous Codon Usage Bias (total citations: 14; self-citations: 0; citations by others: 14)
Synonymous codons are not generally used at equal frequencies, and this trend is observed for most genes and organisms. Several
methods have been proposed and used to estimate the degree of the nonrandom use of the different synonymous codons. The estimates
obtained by these methods, however, show different levels of both precision and dispersion when coding regions of a finite
number of codons are under analysis. Here, we present a study, based on computer simulation, of how the different methods
proposed to evaluate the nonrandom use of synonymous codons are affected by the length of the coding region analyzed. The
results show that some of these methods are heavily influenced by the number of codons and that the comparison of codon usage
bias between coding regions of different lengths shows a methodological bias under different conditions of nonrandom use of
synonymous codons. The study of the dispersion of the estimates obtained by the different methods gives, on the other hand,
an indication of the methods to be applied to compare values of codon usage bias among coding regions of equivalent length.
Received: 10 September 1997 / Accepted: 23 March 1998
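The abstract does not name the measures it evaluates. As one illustrative measure of nonrandom synonymous codon use, the sketch below computes relative synonymous codon usage (RSCU): the observed count of a codon divided by the mean count over all codons for the same amino acid. RSCU is chosen here only as an example and is not necessarily among the methods the study compared. The standard genetic code is assumed, and the names `CODON_TABLE` and `rscu` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds: str) -> dict[str, float]:
    """RSCU per sense codon, for amino acids that occur in the sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons into synonymous families by amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined
        mean = total / len(codons)
        for c in codons:
            values[c] = counts[c] / mean
    return values
```

With only a few codons per amino acid, many RSCU values are 0 or far above 1 purely by chance. This is the kind of length dependence the abstract warns about when coding regions of different lengths are compared.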
18.
Berg OG 《Journal of molecular evolution》1999,48(4):398-407
The synonymous divergence between Escherichia coli and Salmonella typhimurium is explained in a model where there is a large variation between mutation rates at different nucleotide sites in the genome.
The model is based on the experimental observation that spontaneous mutation rates can vary over several orders of magnitude
at different sites in a gene. Such site-specific variation must be taken into account when studying synonymous divergence
and will result in an apparent saturation below the level expected from an assumption of uniform rates. Recently, it has been
suggested that codon preference in enterobacteria has a very large site-specific variation and that the synonymous divergence
between different species, e.g., E. coli and Salmonella, is saturated. In the present communication it is shown that when site-specific variation in mutation rates is introduced,
there is no need to invoke assumptions of saturation and a large variability in codon preference. The same rate variation
will also bring average mutation rates as estimated from synonymous sequence divergence into numerical agreement with experimental
values.
Received: 10 July 1998 / Accepted: 20 August 1998
19.
The ubiquitous major intrinsic protein (MIP) family includes several transmembrane channel proteins known to exhibit specificity
for water and/or neutral solutes. We have identified 84 fully or partially sequenced members of this family, have multiply
aligned over 50 representative, divergent, fully sequenced members, have used the resultant multiple alignment to derive current
MIP family-specific signature sequences, and have constructed a phylogenetic tree. The tree reveals novel features relevant
to the evolutionary history of this protein family. These features plus an evaluation of functional studies lead to the postulates:
(i) that all current MIP family proteins derived from two divergent bacterial paralogues, one a glycerol facilitator, the
other an aquaporin, and (ii) that most or all current members of the family have retained these or closely related physiological
functions.
Received: 19 April 1996 / Revised: 3 June 1996
20.
Brian R. Morton Virginia M. Oberholzer Michael T. Clegg 《Journal of molecular evolution》1997,45(3):227-231
Substitutions occurring in noncoding sequences of the plant chloroplast genome violate the independence of sites that is
assumed by substitution models in molecular evolution. The probability that a substitution at a site is a transversion, as
opposed to a transition, increases significantly with increasing A + T content of the two adjacent nucleotides. In the present
study, this dependency of substitutions on local context is examined further in a number of noncoding regions from the chloroplast
genome of members of the grass family (Poaceae). Two features were examined: the influence of specific neighboring bases,
as opposed to the general A + T content, on transversion proportion and an influence on substitutions by nucleotides other
than the two immediately adjacent to the site of substitution. In both cases, a significant effect was found. In the case
of specific nucleotides, transversion proportion is significantly higher at sites with a pyrimidine immediately 5′ on either
strand. Substitutions at sites of the type YNR, where N is the site of substitution, have the highest rate of transversion.
This specific effect is secondary to the A + T content effect such that, in terms of proportion of substitutions that are
transversions, the nucleotides are ranked T > A > C > G as to their effect when they are immediately 5′ to the site of substitution.
In the case of nucleotides other than the immediate neighbors, a significant influence on substitution dynamics is observed
in the case where the two neighboring bases are both A and/or T. Thus, substitutions are primarily, but not exclusively, influenced
by the composition of the two nucleotides that are immediately adjacent. These results indicate that the pattern of molecular
evolution of the plant chloroplast genome is extremely complex as a result of a variety of inter-site dependencies.
Received: 18 October 1996 / Accepted: 12 April 1997
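The context dependence described in this abstract can be sketched as a tally of transversion proportion against the A + T content (0, 1, or 2) of the two flanking nucleotides. This is a minimal illustration under stated assumptions, not the authors' procedure. It compares two pre-aligned, gap-free sequences directly instead of inferring substitutions on a phylogeny, and it skips sites whose neighbors differ between the sequences. The function names are hypothetical.

```python
PURINES = frozenset("AG")


def is_transversion(x: str, y: str) -> bool:
    """True when one base is a purine and the other a pyrimidine."""
    return (x in PURINES) != (y in PURINES)


def transversion_proportion_by_context(seq_a: str, seq_b: str) -> dict:
    """Map flanking A+T count (0, 1, 2) to the transversion proportion.

    Assumptions: the sequences are aligned without gaps, and only sites
    whose two immediate neighbors are identical in both are counted.
    A class with no observed substitutions maps to None.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    a, b = seq_a.upper(), seq_b.upper()
    tallies = {0: [0, 0], 1: [0, 0], 2: [0, 0]}  # [transversions, total]

    for i in range(1, len(a) - 1):
        x, y = a[i], b[i]
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue  # not a substitution between unambiguous bases
        if a[i - 1] != b[i - 1] or a[i + 1] != b[i + 1]:
            continue  # flanking context is itself ambiguous
        flank = a[i - 1] + a[i + 1]
        if any(n not in "ACGT" for n in flank):
            continue
        at_count = sum(n in "AT" for n in flank)
        tallies[at_count][1] += 1
        tallies[at_count][0] += is_transversion(x, y)

    return {k: (tv / n if n else None) for k, (tv, n) in tallies.items()}
```

On real noncoding alignments, the abstract's finding would appear as a proportion that rises from class 0 to class 2. Testing the specific 5′ neighbor effect (T > A > C > G) would require tallying by the identity of the preceding base instead of the A + T count.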