Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Summary. One hundred twelve human DNA sequences were analyzed with respect to dinucleotide frequency and amino acid composition. The variation in guanine and cytosine (G+C) content revealed: (1) at the 2–3 and 3–1 doublet positions, CG discrimination is attenuated at high G+C, but TA disfavor is enhanced, and (2) several amino acids are subject to G+C change. These findings have been reported in part for collections of sequences from various species. The present study confirms that in a single organism, the human, the G+C effects do exist. Aspects of the argument that connects G+C with protein thermal stability are also discussed.
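
The doublet positions named in this abstract (1–2, 2–3 and 3–1, the last spanning two adjacent codons) can be tallied with a few lines of Python. This is only a minimal sketch: the function name `doublet_counts` is hypothetical, the input is assumed to be an in-frame coding sequence, and the abstract does not describe how the authors actually computed their frequencies.

```python
from collections import Counter


def doublet_counts(cds: str) -> dict[str, Counter]:
    """Count dinucleotides at codon doublet positions 1-2, 2-3 and 3-1.

    Assumes `cds` is an in-frame coding sequence; a trailing partial
    codon is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    counts = {"1-2": Counter(), "2-3": Counter(), "3-1": Counter()}
    for i in range(0, usable, 3):
        counts["1-2"][seq[i:i + 2]] += 1
        counts["2-3"][seq[i + 1:i + 3]] += 1
        if i + 3 < usable:
            # A 3-1 doublet joins the last base of this codon
            # to the first base of the next codon.
            counts["3-1"][seq[i + 2:i + 4]] += 1
    return counts


counts = doublet_counts("ATGCGTTAA")  # codons: ATG CGT TAA
print(counts["3-1"])
```

Dividing each count by the total for its position gives the observed doublet frequencies, which could then be compared across sequences of differing G+C content.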

2.
Summary. The G+C content of DNA varies widely in different organisms, especially microorganisms. This variation is accompanied by changes in the nucleotide composition of silent positions in codons. (Silent positions are defined and explained in the text.) These changes are mostly neutral or near neutral, and appear to result from mutation pressure in the direction of increasing either A+T (AT pressure) or G+C (GC pressure) content. Variations in G+C content are also accompanied by substitutions at replacement positions in codons. These substitutions produce changes in the amino acid content of homologous proteins. The examples studied were genes for 13 mitochondrial proteins in five species, and A and B genes for bacterial tryptophan synthase in four species. In microorganisms, varying AT and GC mutational pressures, presumably resulting from shifts in the DNA polymerase system, exert strong effects on molecular evolution by changing the G+C content of DNA. These effects may be greater than those of random drift. The effects of GC pressure on silent substitutions in the systems examined are several times as great as the effects on replacement substitutions. GC pressure is exerted on noncoding as well as coding regions in mitochondrial DNA. This is shown by the close correlation (correlation coefficient, 0.99) of the G+C content of the noncoding D loop of mitochondria with the G+C content of silent positions in the corresponding mitochondrial genes.

3.
Summary. We present a phenomenological theory expressing the constraints operating on the (G+C) contents of the three codon positions, i.e., first, second, and third bases of codons, by using the smallest number of constraint parameters having clear physical and genetic meaning. Theoretical curves displaying base composition at each of the three codon sites are given. The agreement between the theoretical curves and the data points of 1277 genes is quite good irrespective of the species from which the DNAs originated; the curves might be universal ones and the constraint parameters might have general biological meanings in relation to the DNA/RNA and protein functions.

4.
Summary. The genomes of the human viruses herpes simplex 1 (HSV1) and varicella zoster (VZV), although similar in biology, largely concordant in gene order, and identical in many amino acid segments, differ widely in their genomic G+C (abbreviated S) content, which is high in HSV1 (68%) and low in VZV (46%). This paper analyzes several striking codon usage contrasts. The S difference in coding regions is dramatically large in codon site 3, S3, about 42%. The large difference in S3 is maintained at the same level in a subset of closely similar genes and even in corresponding identical amino acid blocks. A similar difference in S levels in silent site 1 (S1) is found in leucine and arginine. The difference in S3 levels occurs in every gene and in every multicodon amino acid form. The S difference also exists in amino acid usage, with HSV1 using significantly more codon types SSN, while VZV uses more codon types WWN (where W stands for A or T). The nonoverlapping and narrow histograms of S3 gene frequencies in both viruses suggest that the difference has arisen and been maintained by a process of selective rather than nonselective effects. This is in sharp contrast to the relatively large variance seen for highly similar genes in the human versus yeast analysis. Interpretations and hypotheses to explain the HSV1 vs VZV codon usage disparity relate to virus-host interactions, to the role of viral genes in DNA metabolism, to availability of molecular resources (molecular Gause exclusion principle), and to differences in genomic structure.

5.
Summary. We have analyzed the correlation that exists between the GC levels of third and first or second codon position for about 1400 human coding sequences. The linear relationship that was found indicates that the large differences in GC level of third codon positions of human genes are paralleled by smaller differences in GC levels of first and second codon positions. Whereas third codon position differences correspond to very large differences in codon usage within the human genome, the first and second codon position differences correspond to smaller, yet very remarkable, differences in the amino acid composition of encoded proteins. Because GC levels of codon positions are linearly correlated with the GC levels of the isochores harboring the corresponding genes, both codon usage and amino acid composition are different for proteins encoded by genes located in isochores of different GC levels. Furthermore, we have also shown that a linear relationship with a unity slope and a correlation coefficient of 0.77 exists between GC levels of introns and exons from the 238 human genes currently available for this analysis. Introns are, however, about 5% lower in GC, on average, than exons from the same genes.
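
The per-position GC levels that this abstract correlates (first, second, and third codon positions) reduce to a simple stride over the coding sequence. The sketch below is illustrative only: `gc_by_codon_position` is a hypothetical name, the input is assumed to be in frame, and nothing here is taken from the authors' own pipeline.

```python
def gc_by_codon_position(cds: str) -> tuple[float, ...]:
    """Return the GC fraction at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence; a trailing partial
    codon is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    levels = []
    for pos in range(3):
        bases = seq[pos:usable:3]  # every third base, starting at this position
        levels.append(sum(base in "GC" for base in bases) / len(bases))
    return tuple(levels)


gc1, gc2, gc3 = gc_by_codon_position("ATGGCCGCG")  # codons: ATG GCC GCG
print(gc1, gc2, gc3)
```

Computing these three values for each gene in a collection would give the point clouds (GC3 against GC1 or GC2) to which a linear fit of the kind described in the abstract could be applied.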

6.
We conducted a genome-wide analysis of variations in guanine plus cytosine (G+C) content at the third codon position at silent substitution sites of orthologous human and mouse protein-coding nucleotide sequences. Alignments of 3776 human protein-coding DNA sequences with mouse orthologs having >50 synonymous codons were analyzed, and nucleotide substitutions were counted by comparing sequences in the alignments extracted from gap-free regions. The G+C content at silent sites in these pairs of genes showed a strong negative correlation (r = -0.93). Some gene pairs showed significant differences in G+C content at the third codon position at silent substitution sites. For example, human thymine-DNA glycosylase was A+T-rich at the silent substitution sites, while the orthologous mouse sequence was G+C-rich at the corresponding sites. In contrast, human matrix metalloproteinase 23B was G+C-rich at silent substitution sites, while the mouse ortholog was A+T-rich. We discuss possible implications of this significant negative correlation of G+C content at silent sites.

7.
G+C3 structuring along the genome: a common feature in prokaryotes   Total citations: 1 (self-citations: 0; citations by others: 1)
The heterogeneity of gene nucleotide content in prokaryotic genomes is commonly interpreted as the result of three main phenomena: (1) genes undergo different selection pressures both during and after translation (affecting codon and amino acid choice); (2) genes undergo different mutational pressure whether they are on the leading or lagging strand; and (3) genes may have different phylogenetic origins as a result of lateral transfers. However, this view neglects the necessity of organizing genetic information on a chromosome that needs to be replicated and folded, which may add constraints to single gene evolution. As a consequence, genes are potentially subjected to different mutation and selection pressures, depending on their position in the genome. In this paper, we analyze the structuring of different codon usage measures along completely sequenced bacterial genomes. We show that most of them are highly structured, suggesting that genes have different base content, depending on their location on the chromosome. A peculiar pattern of genome structure, with a tendency toward an A+T-enrichment near the replication terminus, is found in most bacterial phyla and may reflect common chromosome constraints. Several species may have lost this pattern, probably because of genome rearrangements or integration of foreign DNA. We show that in several species, this enrichment is associated with an increase of evolutionary rate and we discuss the evolutionary implications of these results. We argue that structural constraints acting on the circular chromosome are not negligible and that this natural structuring of bacterial genomes may be a cause of overestimation in lateral gene transfer predictions using codon composition indices.

8.
9.
10.
We studied the correlations between amino acid composition and mononucleotide and dinucleotide frequencies in 115 bacterial genomes of varying G+C content. Observed amino acid frequencies were compared with those expected from the actual mononucleotide and dinucleotide frequencies. Both mononucleotide and dinucleotide frequencies correlate well with the amino acid frequency, with dinucleotide frequencies doing so better. Despite the strong correlations, some of the observed amino acid frequencies, in particular for Arg, Val, Asp, Glu, Ser, and Cys, were consistently different from predicted values in all genomes. We suggest that this variation from predicted values is a consequence of selection pressure at the level of amino acids, while the close correspondence to the predictions in residues such as Thr, Phe, Lys, and Asn arises only from mutation and selection pressure at the level of the nucleic acid sequences.
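
One simple way to obtain amino acid frequencies "expected" from mononucleotide frequencies is to treat the three bases of a codon as independent draws and sum codon probabilities per amino acid under the standard genetic code. The sketch below makes exactly those assumptions; the abstract does not state the authors' model (which also used dinucleotide frequencies), and the name `expected_amino_acid_freqs` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def expected_amino_acid_freqs(base_freqs: dict[str, float]) -> dict[str, float]:
    """Expected amino acid frequencies from mononucleotide frequencies.

    Assumption (not from the source): codon positions are independent,
    so P(codon) is the product of its three base frequencies. Stop codons
    are excluded and the remaining probabilities are renormalized.
    """
    expected: dict[str, float] = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        prob = 1.0
        for base in codon:
            prob *= base_freqs[base]
        expected[aa] = expected.get(aa, 0.0) + prob
    total = sum(expected.values())
    return {aa: prob / total for aa, prob in expected.items()}


# Hypothetical G+C-rich genome: 35% G, 35% C, 15% A, 15% T.
freqs = expected_amino_acid_freqs({"G": 0.35, "C": 0.35, "A": 0.15, "T": 0.15})
print(sorted(freqs.items(), key=lambda item: -item[1])[:5])
```

Comparing such expectations with observed proteome composition is one way to see which residues deviate from what base composition alone would predict; a dinucleotide-based expectation would need a different model.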

11.
Abstract: Intact neurofilaments were isolated from bovine spinal cord white matter, washed by sedimentation in 0.1 M NaCl, and extracted with 8 M urea. Solubilized neurofilament triplet proteins of molecular weights approximately 68,000 (P68), 150,000 (P150), and 200,000 (P200) were purified by preparative electrophoresis, using an LKB 7900 Uniphor apparatus. The method provides for an enhanced yield of purified protein and has markedly reduced admixture of electrophoresed protein with acrylamide and associated protein contaminants. Amino acid compositions of the purified neurofilament triplet proteins are reported and compared.

12.
The amino acid compositions of proteins from halophilic archaea were compared with those from non-halophilic mesophiles and thermophiles, in terms of the protein surface and interior, on a genome-wide scale. As we previously reported for proteins from thermophiles, a biased amino acid composition also exists in halophiles, in which an abundance of acidic residues was found on the protein surface as compared to the interior. This general feature did not seem to depend on the individual protein structures, but was applicable to all proteins encoded within the entire genome. Unique protein surface compositions are common in both halophiles and thermophiles. Statistical tests have shown that significant surface compositional differences exist among halophiles, non-halophiles, and thermophiles, while the interior composition within each of the three types of organisms does not significantly differ. Although thermophilic proteins have an almost equal abundance of both acidic and basic residues, a large excess of acidic residues in halophilic proteins seems to be compensated by fewer basic residues. Aspartic acid, lysine, asparagine, alanine, and threonine significantly contributed to the compositional differences of halophiles from meso- and thermophiles. Among them, however, only aspartic acid deviated largely from the expected amount estimated from the dinucleotide composition of the genomic DNA sequence of the halophile, which has an extremely high G+C content (68%). Thus, the other residues with large deviations (Lys, Ala, etc.) from their non-halophilic frequencies could have arisen merely as "dragging effects" caused by the compositional shift of the DNA, which would have changed to increase principally the fraction of aspartic acid alone.

13.
Mammalian gene evolution: Nucleotide sequence divergence between mouse and rat   Total citations: 16 (self-citations: 0; citations by others: 16)
As a paradigm of mammalian gene evolution, the nature and extent of DNA sequence divergence between homologous protein-coding genes from mouse and rat have been investigated. The data set examined includes 363 genes totalling 411 kilobases, making this by far the largest comparison conducted between a single pair of species. Mouse and rat genes are on average 93.4% identical in nucleotide sequence and 93.9% identical in amino acid sequence. Individual genes vary substantially in the extent of nonsynonymous nucleotide substitution, as expected from protein evolution studies; here the variation is characterized. The extent of synonymous (or silent) substitution also varies considerably among genes, though the coefficient of variation is about four times smaller than for nonsynonymous substitutions. A small number of genes mapped to the X-chromosome have a slower rate of molecular evolution than average, as predicted if molecular evolution is male-driven. Base composition at silent sites varies from 33% to 95% G + C in different genes; mouse and rat homologues differ on average by only 1.7% in silent-site G + C, but it is shown that this is not necessarily due to any selective constraint on their base composition. Synonymous substitution rates and silent site base composition appear to be related (genes at intermediate G + C have on average higher rates), but the relationship is not as strong as in our earlier analyses. Rates of synonymous and nonsynonymous substitution are correlated, apparently because of an excess of substitutions involving adjacent pairs of nucleotides. Several factors suggest that synonymous codon usage in rodent genes is not subject to selection.

14.
Summary. Previous studies showed that the cellular amino acid composition, obtained by amino acid analysis of whole cells, differs among organisms such as eubacteria, protozoa, fungi, and mammalian cells. These results suggest that the difference in the cellular amino acid composition reflects biological changes as the result of evolution. However, the basic pattern of cellular amino acid composition was relatively constant in all organisms examined. In the present study, we examined archaeobacteria, because they are considered important in understanding the relationship between biological evolution and cellular amino acid composition. The cellular amino acid compositions of Archaeoglobus fulgidus, Pyrococcus horikoshii, Methanobacterium thermoautotrophicum and Methanococcus jannaschii differed slightly from each other, but were similar to those determined from codon usage data, based on the complete genomes. Thus, the cellular amino acid composition reflects biological evolution. We suggest that primitive forms of life appearing on earth at the end of prebiotic evolution had a similar cellular amino acid composition. Received November 28, 2000; accepted January 30, 2001.

15.
Zhou XX, Wang YB, Pan YJ, Li WF. Amino Acids 2008, 34(1): 25-33
Summary. Thermophilic proteins show substantially higher intrinsic thermal stability than their mesophilic counterparts. Amino acid composition is believed to alter the intrinsic stability of proteins. Several investigations and mutagenesis experiments have been carried out to understand how amino acid composition contributes to the thermostability of proteins. This review presents some generalized features of amino acid composition found in thermophilic proteins, including an increase in residue hydrophobicity, a decrease in uncharged polar residues, an increase in charged residues, an increase in aromatic residues, certain amino acid coupling patterns and amino acid preferences for thermophilic proteins. The differences in amino acid composition between thermophilic and mesophilic proteins are related to some properties of amino acids. These features provide guidelines for engineering mesophilic proteins into thermophilic proteins. Authors' addresses: Yuan-Jiang Pan, Institute of Chemical Biology and Pharmaceutical Chemistry, Zhejiang University, Zhejiang University Road 38, Hangzhou 310027, China; Wei-Fen Li, Microbiology Division, College of Animal Science, Zhejiang University, Hangzhou 310029, China

16.
Summary. Several forms of maximum likelihood models are applied to aligned amino acid sequence data coded for in the mitochondrial DNA of six species (chicken, frog, human, bovine, mouse, and rat). These models range in form from relatively simple models of the type currently used for inferring phylogenetic tree structure to models more complex than those that have been used previously. No major discrepancies between the optimal trees inferred by any of these methods are found, but there are huge differences in adequacy of fit. A very significant finding is that the fit of any of these models is vastly improved by allowing a certain proportion of the amino acid sites to be invariant. An even more important, although disquieting, finding is that none of these models fits well, as judged by standard statistical criteria. The primary reason for this is that amino acid sites undergo substitution according to a process that is very heterogeneous. Because most phylogenetic inference is accomplished by choosing the optimal tree under the assumption that a homogeneous process is acting on the sites, the potential invalidity of some such conclusions is raised by this article's results. The seriousness of this problem depends upon the robustness of the phylogenetic inferential procedure to departures from the underlying model.

17.
FramePlot is a web-based tool for predicting protein-coding regions in bacterial DNA with a high G + C content, such as Streptomyces. The graphical output provides for easy distinction of protein-coding regions from non-coding regions. The plot is a clickable map. Clicking on an ORF provides not only the nucleotide sequence but also its deduced amino acid sequence. These sequences can then be compared to the NCBI sequence database over the Internet. The program is freely available for academic purposes at http://www.nih.go.jp/jun/cgi-bin/frameplot.pl.

18.
Many polar fishes synthesize a group of eight glycopeptides that exhibit a non-colligative lowering of the freezing point of water. These glycopeptides range in molecular weight between 2600 and 33700. The largest glycopeptides (1–5) lower the freezing point more than the small ones on a weight basis and contain only two amino acids, alanine and threonine, with the disaccharide galactose-N-acetyl-galactosamine attached to threonine. The smaller glycopeptides, 6, 7, and 8, also lower the freezing point and contain proline, which periodically substitutes for alanine. Glycopeptides with similar antifreeze properties isolated from the saffron cod and the Atlantic tomcod contain an additional amino acid, arginine, which substitutes for threonine in glycopeptide 6. In this study we address the question of whether differences in amino acid composition or molecular weight between large and small glycopeptides are responsible for the reduced freezing point depressing capability of the low molecular weight glycopeptides. The results indicate that the degree of amino acid substitution that occurs in glycopeptides 6–8 does not have a significant effect on the unusual freezing point lowering and that the observed decrease in freezing point depression with smaller glycopeptides can be accounted for on the basis of molecular weight.

19.
G:C pairs are more stable than A:T pairs because they have an additional hydrogen bond. This has led to many studies on the correlation between the guanine+cytosine (G+C) content of nucleic acids and temperature over the last 20 years. We collected the optimal growth temperatures (Topt) and the G+C contents of genomic DNA; 23S, 16S, and 5S ribosomal RNAs; and transfer RNAs for 764 prokaryotic species. No correlation was found between genomic G+C content and Topt, but there were striking correlations between the G+C content of ribosomal and transfer RNA stems and Topt. Two explanations, neutral evolution and selection pressure, have been proposed for the approximate equalities of G and C (respectively, A and T) contents within each strand of DNA molecules. Our results do not support the notion that selection pressure induces complementary oligonucleotides in close proximity and therefore numerous secondary structures in prokaryotic DNA, as the genomic G+C content does not behave in the same way as that of folded RNA with respect to optimal growth temperature. Received: 25 September 1996 / Accepted: 21 January 1997

20.
In pairwise comparisons of gene frequency data from the three major races of man, the single locus measures of the heterozygosity within and the genetic distance between races are shown to be strongly correlated across the loci coding for red cell proteins and enzymes. The intercept of the regression line of genetic distance on heterozygosity in protein and enzyme loci is statistically insignificant. These findings suggest that the genetic variability at the enzyme and protein loci in man is probably maintained by a balance of mutation and random genetic drift. At the blood group loci, however, the observed relationship between genetic distance and heterozygosity does not follow the expectation of the neutral mutation hypothesis. These observations are discussed in terms of the changes in probability of identical monomorphism in two populations during the process of gene differentiation.
