Similar articles
20 similar articles found.
1.
We show that in animal mitochondria homologous genes that differ in guanine plus cytosine (G + C) content code for proteins differing in amino acid content in a manner that relates to the G + C content of the codons. DNA sequences were analyzed using square plots, a new method that combines graphical visualization and statistical analysis of compositional differences in both DNA and protein. Square plots divide codons into four groups based on first and second position A + T (adenine plus thymine) and G + C content and indicate differences in amino acid content when comparing sequences that differ in G + C content. When sequences are compared using these plots, the amino acid content is shown to correlate with the nucleotide bias of the genes. This amino acid effect is shown in all protein-coding genes in the mitochondrial genome, including cox I, cox II, and cyt b, mitochondrial genes which are commonly used for phylogenetic studies. Furthermore, nucleotide content differences are shown to affect the content of all amino acids with A + T- and G + C-rich codons. We speculate that phylogenetic analysis of genes so affected may tend erroneously to indicate relatedness (or lack thereof) based only on amino acid content. Received: 3 July 1996 / Accepted: 6 November 1996
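The four-way grouping of codons described above can be sketched as follows. This is a minimal illustration, not the authors' square-plot implementation: the labels `W` (A or T) and `S` (G or C) and the function names `codon_group` and `group_counts` are assumptions introduced here, and the statistical comparison between sequences is not reproduced.

```python
from collections import Counter

# Hypothetical labels: W = A or T, S = G or C (not taken from the abstract).
BASE_CLASS = {"A": "W", "T": "W", "G": "S", "C": "S"}


def codon_group(codon):
    """Label a codon by the A+T / G+C class of its first and second positions."""
    return BASE_CLASS[codon[0]] + BASE_CLASS[codon[1]]


def group_counts(cds):
    """Count the codons of an in-frame coding sequence in each of the four groups."""
    cds = cds.upper()
    counts = Counter({"WW": 0, "WS": 0, "SW": 0, "SS": 0})
    # Ignore any trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if set(codon) <= set("ACGT"):  # skip codons with ambiguous bases
            counts[codon_group(codon)] += 1
    return dict(counts)
```

Comparing the four counts for two homologous genes that differ in G + C content gives a rough numerical view of the shift between A + T-rich and G + C-rich codon groups that the abstract describes.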

2.
Genes of a multicellular organism are heterogeneous in G+C content, particularly at the third codon position. The extent of deviation from the intra-strand equality rule of A = T and G = C (Parity Rule 2, or PR2) is specific for individual amino acids and has been expressed as the PR2-bias fingerprint. Previous results suggested that the PR2-bias fingerprints tend to be similar among the genes of an organism, and the fingerprint of the organism is specific for different taxa, reflecting phylogenetic relationships of organisms. In this study, using coding sequences of a large number of human genes, we examined the intragenomic heterogeneity of their PR2-bias fingerprints in relation to the G+C content of the third codon position (P3). The results show that the PR2-bias fingerprint is similar across a wide range of G+C content at the third codon position (0.30–0.80). This range covers approximately 89% of the genes, and further analysis of the high G+C range (0.80–1.00), where genes with normal PR2-bias fingerprints and those with anomalous fingerprints are mixed, shows that a total of 95% of genes have similar fingerprints. The result indicates that the PR2-bias fingerprint is a unique property of an organism and represents the overall characteristics of the genome. Combined with the previous results that the evolutionary change of the PR2-bias fingerprint is a slow process, PR2-bias fingerprints may be used for phylogenetic analyses to supplement and augment the conventional methods that use the differences of the sequences of orthologous proteins and nucleic acids. Potential advantages and disadvantages of the PR2-bias fingerprint analysis are discussed. Received: 21 December 2000 / Accepted: 16 February 2001
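The third-position quantities discussed above can be computed along these lines. This is a sketch under assumptions: it pools all codons of a gene rather than producing the per-amino-acid fingerprint, it expresses the PR2 deviation as the ratios A3/(A3+T3) and G3/(G3+C3) (0.5 means no deviation from A = T or G = C), and the function name and dictionary keys are hypothetical.

```python
def third_position_stats(cds):
    """Return G+C content and PR2-style ratios at the third codon position."""
    cds = cds.upper()
    # Collect the third base of every complete codon.
    third = [cds[i + 2] for i in range(0, len(cds) - len(cds) % 3, 3)]
    n = {base: third.count(base) for base in "ACGT"}
    total = sum(n.values())
    nan = float("nan")
    return {
        "gc3": (n["G"] + n["C"]) / total if total else nan,
        # 0.5 corresponds to A = T (respectively G = C) within the strand.
        "a_ratio": n["A"] / (n["A"] + n["T"]) if n["A"] + n["T"] else nan,
        "g_ratio": n["G"] / (n["G"] + n["C"]) if n["G"] + n["C"] else nan,
    }
```

Restricting the input to the codons of a single amino acid before calling the function would give one element of a fingerprint in the sense used above.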

3.
4.
The extent to which base composition and codon usage vary among RNA viruses, and the possible causes of this bias, is undetermined in most cases. A maximum-likelihood statistical method was used to test whether base composition and codon usage bias covary with arthropod association in the genus Flavivirus, a major source of disease in humans and animals. Flaviviruses are transmitted by mosquitoes, by ticks, or directly between vertebrate hosts. Those viruses associated with ticks were found to have a significantly lower G+C content than non-vector-borne flaviviruses and this difference was present throughout the genome at all amino acids and codon positions. In contrast, mosquito-borne viruses had an intermediate G+C content which was not significantly different from those of the other two groups. In addition, biases in dinucleotide and codon usage that were independent of base composition were detected in all flaviviruses, but these did not covary with arthropod association. However, the overall effect of these biases was slight, suggesting only weak selection at synonymous sites. A preliminary analysis of base composition, codon usage, and vector specificity in other RNA virus families also revealed a possible association between base composition and vector specificity, although with biases different from those seen in the Flavivirus genus. Received: 29 August 2000 / Accepted: 19 December 2000

5.
A survey of the patterns of synonymous codon preference in the HIV env gene reveals a correlation between the codon bias and the mutability requirements of different regions of the protein. At hypervariable regions in gp120 one finds a greater proportion of codons that tend to mutate nonsynonymously, but to a target that is similar in hydrophobicity and volume. We argue that this strategy results from a compromise between the selective pressure placed on the virus by the induced immune response, which favors amino acid substitutions in the complementarity determining regions, and the negative selection against missense mutations that violate structural constraints of the env protein. Received: 9 June 1997 / Accepted: 25 May 1998

6.
The primary and secondary structure of the small-subunit ribosomal RNA (ssrRNA) gene from the naked, marine amoeba, Vannella anglica (subclass Gymnamoebia), was determined. The ssrRNA is 1962 nucleotides in length, with a low G+C content of 37.1%. The ssrRNA is composed of several uncommon secondary structure features including helix E8-1, which may be a useful target for rRNA probes for the direct identification of isolates in mixed culture. Phylogenetic analysis of sequence data showed that V. anglica branched prior to the rapid diversification of the eukaryotes. It did not associate with the other naked, lobose amoebae represented by Acanthamoeba and Hartmannella, indicating that Vannella represents a separate amoeboid lineage and the subclass Gymnamoebia is polyphyletic. Received: 9 July 1998 / Accepted: 16 November 1998

7.
We previously found that proteinaceous protease inhibitors homologous to Streptomyces subtilisin inhibitor (SSI) are widely produced by various Streptomyces species, and we designated them "SSI-like proteins" (Taguchi S, Kikuchi H, Suzuki M, Kojima S, Terabe M, Miura K, Nakase T, Momose H [1993] Appl Environ Microbiol 59:4338–4341). In this study, SSI-like proteins from five strains of the genus Streptoverticillium were purified and sequenced, and molecular phylogenetic trees were constructed on the basis of the determined amino acid sequences together with those determined previously for Streptomyces species. The phylogenetic trees showed that SSI-like proteins from Streptoverticillium species are phylogenetically included in Streptomyces SSI-like proteins but form a monophyletic group as a distinct lineage within the Streptomyces proteins. This provides an alternative phylogenetic framework to the previous one based on partial small ribosomal RNA sequences, and it may indicate that the phylogenetic affiliation of the genus Streptoverticillium should be revised. The phylogenetic trees also suggested that SSI-like proteins possessing arginine or methionine at the P1 site, the major reactive center site toward target proteases, arose multiple times on independent lineages from ancestral proteins possessing lysine at the P1 site. Most of the codon changes at the P1 site inferred to have occurred during the evolution of SSI-like proteins are consistent with those inferred from the extremely high G + C content of Streptomyces genomes. The inferred minimum number of amino acid replacements at the P1 site was nearly equal to the average number for all the variable sites. It thus appears that positive Darwinian selection, which has been postulated to account for accelerated rates of amino acid replacement at the major reaction center site of mammalian protease inhibitors, may not have dictated the evolution of the bacterial SSI-like proteins. Received: 23 August 1996 / Accepted: 20 November 1996

8.
Base composition is not uniform across the genome of Drosophila melanogaster. Earlier analyses have suggested that there is variation in composition in D. melanogaster on both a large scale and a much smaller, within-gene, scale. Here we present analyses on 117 genes which have reliable intron/exon boundaries and no known alternative splicing. We detect significant heterogeneity in G+C content among intron segments from the same gene, as well as a significant positive correlation between the intron and the third codon position G+C content within genes. Both of these observations appear to be due, in part, to an overall decline in intron and third codon position G+C content along Drosophila genes with introns. However, there is also evidence of an increase in third codon position G+C content at the start of genes; this is particularly evident in genes without introns. This is consistent with selection acting against preferred codons at the start of genes. Received: 24 February 1997 / Accepted: 10 November 1997

9.
Microsatellite Evolution: Testing the Ascertainment Bias Hypothesis   Total citations: 5 (self-citations: 0; citations by others: 5)
Previous studies suggest the median allele length of microsatellites is longest in the species from which the markers were derived, suggesting that an ascertainment bias was operating. We have examined whether the size distribution of microsatellite alleles between sheep and cattle is source dependent using a set of 472 microsatellites that can be amplified in both species. For those markers that were polymorphic in both species we report a significantly greater number of markers (P < 0.001) with longer median allele sizes in sheep, regardless of microsatellite origin. This finding suggests that any ascertainment bias operating during microsatellite selection is only a minor contributor to the variation observed. Received: 6 January 1997 / Accepted: 19 May 1997

10.
Unlike birds and mammals, teleost fish express two paralogous isoforms (paralogues) of cytosolic malate dehydrogenase (cMDH; EC 1.1.1.37; NAD+: malate oxidoreductase) whose evolutionary relationships to the single cMDH of tetrapods are unknown. We sequenced complementary DNAs for both cMDHs and the mitochondrial isoform (mMDH) of the fish Sphyraena idiastes (south temperate barracuda) and compared the sequences, kinetic properties, and thermal stabilities of the three isoforms with those of mammalian orthologues. Both fish cMDHs comprise 333 residues and have subunit masses of approximately 36 kDa. One cytosolic isoform, cMDH-S, was significantly more heat-stable than either the other cMDH (cMDH-L) or mMDH. In contradiction to the generally accepted model of vertebrate cMDH evolution, our phylogenetic analysis indicates that the duplication of the fish cytosolic paralogues occurred after the divergence of the lineages leading to teleosts and tetrapods. cMDH-L and cMDH-S differed in optimal concentrations of substrates and cofactors and apparent Michaelis–Menten constants, suggesting that the two paralogues may play distinct physiological roles. Differences in intrinsic thermal stability among MDH paralogues may reflect different degrees of stabilization in vivo by extrinsic stabilizers, notably protein concentration in the case of mMDH. Thermal stabilities of porcine mMDH and cMDH-L, but not cMDH-S, were significantly increased when denaturation was measured at a high protein (bovine serum albumin; BSA) concentration, but the BSA-induced stabilization reduced the catalytic activity. Received: 5 April 2001 / Accepted: 28 June 2001

11.
Friedreich ataxia is an autosomal recessive neurodegenerative disorder associated with a GAA repeat expansion in the first intron of the gene (FRDA) encoding a novel, highly conserved, 210 amino acid protein known as frataxin. Normal variation in repeat size was determined by analysis of more than 600 DNA samples from seven human populations. This analysis showed that the most frequent allele had nine GAA repeats, and no alleles with fewer than five GAA repeats were found. The European and Syrian populations had the highest percentage of alleles with 10 or more GAA repeats, while the Papua New Guinea population did not have any alleles carrying more than 10 GAA repeats. The distributions of repeat sizes in the European, Syrian, and African American populations were significantly different from those in the Asian and Papua New Guinea populations (p < 0.001). The GAA repeat size was also determined in five nonhuman primates. Samples from 10 chimpanzees, 3 orangutans, 1 gorilla, 1 rhesus macaque, 1 mangabey, and 1 tamarin were analyzed. Among those primates belonging to the Pongidae family, the chimpanzees were found to carry three or four GAA repeats, the orangutans had four or five GAA repeats, and the gorilla carried three GAA repeats. In primates belonging to the Cercopithecidae family, three GAA repeats were found in the mangabey and two in the rhesus macaque. However, an AluY subfamily member inserted in the poly(A) tract preceding the GAA repeat region in the rhesus macaque, making the amplified sequence approximately 300 bp longer. The GAA repeat was also found in the tamarin, suggesting that it arose at least 40 million years ago and remained relatively small throughout the majority of primate evolution, with a punctuated expansion in the human genome. Received: 18 August 2000 / Accepted: 10 November 2000

12.
The mitochondrial DNA-encoded cytochrome oxidase subunit I (COI) gene and the nuclear DNA-encoded hsp60 gene from the euglenoid protozoan Euglena gracilis were cloned and sequenced. The COI sequence represents the first example of a mitochondrial genome-encoded gene from this organism. This gene contains seven TGG tryptophan codons and no TGA tryptophan codons, suggesting the use of the universal genetic code. This differs from the situation in the mitochondrion of the related kinetoplastid protozoa, in which TGA codes for tryptophan. In addition, a complete absence of CGN triplets may imply the lack of the corresponding tRNA species. COI cDNAs from E. gracilis possess short 5′ and 3′ untranslated transcribed sequences and lack a 3′ poly[A] tail. The COI gene does not require uridine insertion/deletion RNA editing, as occurs in kinetoplastid mitochondria, to be functional, and no short guide RNA-like molecules could be visualized by labeling total mitochondrial RNA with [α-32P]GTP and guanylyl transferase. In spite of the differences in codon usage and the 3′ end structures of mRNAs, phylogenetic analysis using the COI and hsp60 protein sequences suggests a monophyletic relationship between the mitochondrial genomes of E. gracilis and of the kinetoplastids, which is consistent with the phylogenetic relationship of these groups previously obtained using nuclear ribosomal RNA sequences. Received: 5 March 1996 / Accepted: 31 July 1996

13.
Phylogenetic analyses based on mitochondrial ND5 gene comparisons and the geohistory of the Japanese Islands suggest that each Japanese species belonging to the subtribe Carabina has its own history for the establishment of its present habitat in the Japanese Islands. These species can be roughly classified into two categories: (1) species which were derived from ancestors that inhabited ancient Japan at the time of its split from the Eurasian Continent [ca. 15 million years ago (MYA)], followed by diversification within the Japanese Islands; and (2) species which invaded Hokkaido from the Eurasian Continent through land-bridges from Sakhalin and/or the Kuriles, or western Japan from the Korean Peninsula, during the glacial era (<2 MYA). Received: 28 September 1999 / Accepted: 25 February 2000

14.
Identifying the G + C difference between closely related bacterial species or between different strains of the same species is one of the first steps in understanding the evolutionary mechanisms accounting for the differences observed among bacterial species. The G + C content can be one of the most important factors in the evolution of genomic structures. In this paper, we describe a new method for detecting an initial stage of differentiation of the G + C content at the third codon base position between two strains of the same bacterial species. We apply this method to the two strains of Helicobacter pylori. A group of genes is detected with large variations of G + C in the third positions, apparently genes of early response to pressures of changing G + C. We discuss our findings from the viewpoint of genomic evolution. Received: 26 February 2001 / Accepted: 16 May 2001

15.
The codon-degeneracy model (CDM) predicts relative frequencies of substitution for any set of homologous protein-coding DNA sequences based on patterns of nucleotide degeneracy, codon composition, and the assumption of selective neutrality. However, at present, the CDM is reliant on outside estimates of transition bias. A new method by which the power of the CDM can be used to find a synonymous transition bias that is optimal for any given phylogenetic tree topology is presented. An example is illustrated that utilizes optimized transition biases to generate CDM GF-scores for every possible phylogenetic tree for pocket gophers of the genus Orthogeomys. The resulting distribution of CDM GF-scores is compared and contrasted with the results of maximum parsimony and maximum likelihood methods. Although convergence on a single tree topology by the CDM and another method indicates greater support for that particular tree, the value of CDM GF-score as the sole optimality criterion for phylogeny reconstruction remains to be determined. It is clear, however, that the a priori estimation of an optimum transition bias from codon composition has a direct application to differentiating between alternative trees. Received: 13 October 1999 / Accepted: 28 April 2000

16.
The relative contribution of mutation and selection to the G+C content of DNA was analyzed in bacterial species having widely different G+C contents. The analysis used two methods that were developed previously. The first method was to plot the average G+C content of a set of nucleotides against the G+C content of the third codon position for each gene. This method was used to present the G+C distribution of the third codon position and to assess the relative neutrality of a set of nucleotides to that of the G+C content of the third codon position. The second method was to plot the intrastrand bias of the third codon position from Parity Rule 2 (PR2), where A=T and G=C. It was found that whereas intragenomic distributions of the DNA G+C content of these bacteria are narrow in the majority of species, in some species the G+C content of the minor class of genes distributes over wider ranges than the major class of genes. On the other hand, ubiquitous PR2 biases are amino acid specific and independent of the G+C content of DNA, so that when averaged over the amino acids, the biases are small and not correlated with the DNA G+C content. Therefore, translation-coupled PR2 biases are unlikely to explain the wide range of G+C contents among different species. Considering all data available, it was concluded that the amino acid-specific PR2 bias has only a minor effect, if any, on the average G+C content. In addition, PR2 bias patterns of different species show phylogenetic relationships, and the pattern can be used as a taxal fingerprint. Received: 5 November 1998 / Accepted: 1 March 1999

17.
An Evaluation of Measures of Synonymous Codon Usage Bias   Total citations: 14 (self-citations: 0; citations by others: 14)
Synonymous codons are not generally used at equal frequencies, and this trend is observed for most genes and organisms. Several methods have been proposed and used to estimate the degree of the nonrandom use of the different synonymous codons. The estimates obtained by these methods, however, show different levels of both precision and dispersion when coding regions of a finite number of codons are under analysis. Here, we present a study, based on computer simulation, of how the different methods proposed to evaluate the nonrandom use of synonymous codons are affected by the length of the coding region analyzed. The results show that some of these methods are heavily influenced by the number of codons and that the comparison of codon usage bias between coding regions of different lengths shows a methodological bias under different conditions of nonrandom use of synonymous codons. The study of the dispersion of the estimates obtained by the different methods gives, on the other hand, an indication of the methods to be applied to compare values of codon usage bias among coding regions of equivalent length. Received: 10 September 1997 / Accepted: 23 March 1998
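The length dependence described above can be illustrated with a small simulation. The abstract does not name the measures it evaluates, so this sketch assumes one simple measure, a chi-square deviation from equal codon use divided by the number of codons, applied to a single hypothetical amino acid with `k` synonymous codons. The function names, the fixed seed, and the replicate count are all illustrative choices.

```python
import random


def scaled_chi_square(counts):
    """Chi-square deviation from equal synonymous codon use, divided by codon count."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts) / n


def mean_bias_under_equal_use(n_codons, k=4, replicates=500, seed=1):
    """Average the measure over simulated genes in which codon use is truly equal."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        counts = [0] * k
        for _ in range(n_codons):
            counts[rng.randrange(k)] += 1  # each synonymous codon equally likely
        total += scaled_chi_square(counts)
    return total / replicates
```

Even though no bias is simulated, the average value of this measure is larger for short coding regions than for long ones, which is the kind of methodological artifact that makes comparisons between regions of different lengths unreliable.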

18.
The synonymous divergence between Escherichia coli and Salmonella typhimurium is explained in a model where there is a large variation between mutation rates at different nucleotide sites in the genome. The model is based on the experimental observation that spontaneous mutation rates can vary over several orders of magnitude at different sites in a gene. Such site-specific variation must be taken into account when studying synonymous divergence and will result in an apparent saturation below the level expected from an assumption of uniform rates. Recently, it has been suggested that codon preference in enterobacteria has a very large site-specific variation and that the synonymous divergence between different species, e.g., E. coli and Salmonella, is saturated. In the present communication it is shown that when site-specific variation in mutation rates is introduced, there is no need to invoke assumptions of saturation and a large variability in codon preference. The same rate variation will also bring average mutation rates as estimated from synonymous sequence divergence into numerical agreement with experimental values. Received: 10 July 1998 / Accepted: 20 August 1998
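The effect of site-to-site rate variation on observed divergence can be sketched numerically. The abstract does not specify its substitution model, so this example assumes the Jukes-Cantor model purely for illustration, with sites split into discrete rate classes. The function names and the example rate classes are hypothetical.

```python
import math


def jc_difference(d):
    """Expected fraction of differing sites after d substitutions per site (Jukes-Cantor)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def mixed_rate_difference(d, rates, weights):
    """Same quantity when sites fall into classes with relative rates and given weights.

    Assumes the weights sum to 1 and the weighted mean rate is 1, so that d
    remains the average number of substitutions per site.
    """
    return sum(w * jc_difference(r * d) for r, w in zip(rates, weights))
```

With the same average number of substitutions per site, for instance `d = 2.0`, a mixture of very slow and fast sites such as `rates=(0.01, 1.99)` with `weights=(0.5, 0.5)` gives a lower observed divergence than the uniform-rate case, because the slow sites have barely changed while the fast sites are already near their maximum. This is one way to picture the apparent saturation below the uniform-rate expectation.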

19.
Substitutions occurring in noncoding sequences of the plant chloroplast genome violate the independence of sites that is assumed by substitution models in molecular evolution. The probability that a substitution at a site is a transversion, as opposed to a transition, increases significantly with increasing A + T content of the two adjacent nucleotides. In the present study, this dependency of substitutions on local context is examined further in a number of noncoding regions from the chloroplast genome of members of the grass family (Poaceae). Two features were examined: the influence of specific neighboring bases, as opposed to the general A + T content, on transversion proportion and an influence on substitutions by nucleotides other than the two immediately adjacent to the site of substitution. In both cases, a significant effect was found. In the case of specific nucleotides, transversion proportion is significantly higher at sites with a pyrimidine immediately 5′ on either strand. Substitutions at sites of the type YNR, where N is the site of substitution, have the highest rate of transversion. This specific effect is secondary to the A + T content effect such that, in terms of proportion of substitutions that are transversions, the nucleotides are ranked T > A > C > G as to their effect when they are immediately 5′ to the site of substitution. In the case of nucleotides other than the immediate neighbors, a significant influence on substitution dynamics is observed in the case where the two neighboring bases are both A and/or T. Thus, substitutions are primarily, but not exclusively, influenced by the composition of the two nucleotides that are immediately adjacent. These results indicate that the pattern of molecular evolution of the plant chloroplast genome is extremely complex as a result of a variety of inter-site dependencies. Received: 18 October 1996 / Accepted: 12 April 1997
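The relationship between flanking A + T content and transversion proportion can be tabulated from a pairwise alignment roughly as follows. This is a simplified sketch rather than the authors' procedure: it assumes two aligned, uppercase, gap-free sequences, infers substitutions directly from pairwise differences instead of from a phylogeny, and uses only sites whose two neighbors are identical in both sequences. The function names are hypothetical.

```python
PURINES = {"A", "G"}


def is_transversion(a, b):
    """A substitution is a transversion when it swaps a purine and a pyrimidine."""
    return (a in PURINES) != (b in PURINES)


def transversion_proportion_by_context(seq1, seq2):
    """Proportion of transversions among differences, keyed by flanking A+T count (0, 1, 2)."""
    tally = {0: [0, 0], 1: [0, 0], 2: [0, 0]}  # [transversions, substitutions]
    for i in range(1, min(len(seq1), len(seq2)) - 1):
        a, b = seq1[i], seq2[i]
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        left, right = seq1[i - 1], seq1[i + 1]
        # Keep only sites whose neighbors are unambiguous and conserved.
        if left != seq2[i - 1] or right != seq2[i + 1]:
            continue
        if left not in "ACGT" or right not in "ACGT":
            continue
        at_count = (left in "AT") + (right in "AT")
        tally[at_count][1] += 1
        tally[at_count][0] += is_transversion(a, b)
    return {k: (tv / n if n else None) for k, (tv, n) in tally.items()}
```

A rising proportion from key 0 to key 2 across a large alignment would correspond to the A + T context effect described above. Extending the key to include the identity of the 5′ neighbor would allow the base-specific ranking to be examined in the same way.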

20.
The ubiquitous major intrinsic protein (MIP) family includes several transmembrane channel proteins known to exhibit specificity for water and/or neutral solutes. We have identified 84 fully or partially sequenced members of this family, have multiply aligned over 50 representative, divergent, fully sequenced members, have used the resultant multiple alignment to derive current MIP family-specific signature sequences, and have constructed a phylogenetic tree. The tree reveals novel features relevant to the evolutionary history of this protein family. These features plus an evaluation of functional studies lead to the postulates: (i) that all current MIP family proteins derived from two divergent bacterial paralogues, one a glycerol facilitator, the other an aquaporin, and (ii) that most or all current members of the family have retained these or closely related physiological functions. Received: 19 April 1996 / Revised: 3 June 1996
