Similar articles
 20 similar articles found (search time: 0 ms)
1.
The compositional properties of human genes (total citations: 8; self-citations: 0; citations by others: 8)
Summary. The present work represents the first attempt to study in greater detail previously proposed compositional correlations in genomes, based on a body of additional data relating to gene localizations as well as to extended flanking sequences extracted from gene banks. We have investigated the correlations that exist between (1) the GC levels of exons of human genes, and (2) the GC levels of either intergenic sequences or introns associated with the genes under consideration. In both cases, linear relationships with slopes close to unity were found. The similarity of the linear relationships indicates similar GC levels in intergenic sequences and introns located in the same isochores. Moreover, both intergenic sequences and introns showed GC levels 5–10% lower than the corresponding exons. The above findings considerably strengthen the previously drawn conclusion that coding and noncoding sequences (both inter- and intragenic) from the same isochores of the human genome are compositionally correlated. In addition, we find linear correlations between the GC levels of codon positions and of the intergenic sequences or introns associated with the corresponding genes, as well as among the GC levels of codon positions of genes.
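
Several abstracts in this list refer to GC levels at individual codon positions (GC1, GC2, GC3). The following is a minimal sketch of how such values could be computed for a single coding sequence; the function name `gc_by_codon_position` is hypothetical and the code assumes the sequence is already in frame. It is not taken from any of the listed papers.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) as fractions for an in-frame coding sequence.

    Assumption: `cds` starts at codon position 1; a trailing partial
    codon, if any, is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    counts = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "GC":
            counts[i % 3] += 1  # i % 3 is the 0-based codon position
    n_codons = usable // 3
    return tuple(c / n_codons for c in counts)


if __name__ == "__main__":
    gc1, gc2, gc3 = gc_by_codon_position("ATGGCCTAA")
    print(f"GC1={gc1:.2f} GC2={gc2:.2f} GC3={gc3:.2f}")
```

Computing the three values per gene and regressing them against the GC level of the flanking introns or intergenic sequence would be one way to look for the kind of linear relationship the abstract describes.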

2.
A correspondence analysis of codon usage in human genes revealed, as expected, that the first axis is strongly correlated with the base composition at synonymous third codon positions. At one extreme of the second axis were localized genes with a high frequency of NCG and CGN codons. The great majority of these sequences were embedded in CpG islands, while the opposite is true for the genes placed at the other extreme. The two main conclusions of this paper are: (1) CpG islands influence codon usage, and (2) since the second axis is orthogonal to (and therefore independent of) the first, GC3-rich genes are not necessarily associated with CpG islands.

3.
4.
Summary. The genomes of human viruses herpes simplex 1 (HSV1) and varicella zoster (VZV), although similar in biology, largely concordant in gene order, and identical in many amino acid segments, differ widely in their genomic G+C (abbreviated S) content, which is high in HSV1 (68%) and low in VZV (46%). This paper analyzes several striking codon usage contrasts. The S difference in coding regions is dramatically large in codon site 3, S3, about 42%. The large difference in S3 is maintained at the same level in a subset of closely similar genes and even in corresponding identical amino acid blocks. A similar difference in S levels in silent site 1 (S1) is found in leucine and arginine. The difference in S3 levels occurs in every gene and in every multicodon amino acid form. The S difference also exists in amino acid usage, with HSV1 using significantly more codon types SSN, while VZV uses more codon types WWN (where W stands for A or T). The nonoverlapping and narrow histograms of S3 gene frequencies in both viruses suggest that the difference has arisen and been maintained by a process of selective rather than nonselective effects. This is in sharp contrast to the relatively large variance seen for highly similar genes in the human versus yeast analysis. Interpretations and hypotheses to explain the HSV1 vs VZV codon usage disparity relate to virus-host interactions, to the role of viral genes in DNA metabolism, to availability of molecular resources (molecular Gause exclusion principle), and to differences in genomic structure.

5.
Summary. We have investigated the relationship between the G + C content of silent (synonymous) sites in codons and the amino acid composition of encoded proteins for approximately 1,600 human genes. There are positive correlations between silent-site G + C and the proportions of codons for Arg, Pro, Ala, Trp, His, Gln, and Leu and negative ones for Tyr, Phe, Asn, Ile, Lys, Asp, Thr, and Glu. The median proteins coded by groups of genes that differ in silent-site G + C content also differ in amino acid composition, as do some proteins coded by homologous genes. The pattern of compositional change can be largely explained by directional mutation pressure, the genetic code, and differences in the frequencies of accepted amino acid substitutions; the shifts in protein composition are likely to be selectively neutral. Offprint requests to: D.W. Collins

6.
Summary. Based on the rates of synonymous substitution in 42 protein-coding gene pairs from rat and human, a correlation is shown to exist between the frequency of the nucleotides in all positions of the codon and the synonymous substitution rate. The correlation coefficients were positive for A and T and negative for C and G. This means that AT-rich genes accumulate more synonymous substitutions than GC-rich genes. Biased patterns of mutation could not account for this phenomenon. Thus, the variation in synonymous substitution rates and the resulting unequal codon usage must be the consequence of selection against A and T in synonymous positions. Most of the variation in rates of synonymous substitution can be explained by the nucleotide composition in synonymous positions. Codon-anticodon interactions, dinucleotide frequencies, and contextual factors influence neither the rates of synonymous substitution nor codon usage. Interestingly, the nucleotide in the second position of codons (always a nonsynonymous position) was found to affect the rate of synonymous substitution. This finding links the rate of nonsynonymous substitution with the synonymous rate. Consequently, highly conservative proteins are expected to be encoded by genes that evolve slowly in terms of synonymous substitutions and are therefore highly biased in their codon usage.

7.
The compositional distributions of large (main-band) DNA fragments from eight birds belonging to eight different orders (including both paleognathous and neognathous species) are very broad and extremely close to each other. These findings, which are paralleled by the compositional similarity of homologous coding sequences and their codon positions, support the idea that birds are a monophyletic group. The compositional distribution of third-codon positions of genes from chicken, the only avian species for which a relatively large number of coding sequences is known, is very broad and bimodal, the minor GC-richer peak reaching 100% GC. The very high compositional heterogeneity of avian genomes is accompanied (as in the case of mammalian genomes) by a very high speciation rate compared to cold-blooded vertebrates, which are characterized by genomes that are much less heterogeneous. The higher GC levels attained by avian compared to mammalian genomes might be correlated with the higher body temperature (41–43°C) of birds compared to mammals (37°C). A comparison of GC levels of coding sequences and codon positions from man and chicken revealed very close average GC levels and standard deviations. Homologous coding sequences and codon positions from man and chicken showed a surprisingly high degree of compositional similarity which was, however, higher for GC-poor than for GC-rich sequences. This indicates that GC-poor isochores of warm-blooded vertebrates reflect the composition of the isochores of the genome of the common reptilian ancestor of mammals and birds, which underwent only a small compositional change at the transition from cold- to warm-blooded vertebrates. In contrast, the GC-rich isochores of birds and mammals are the result of large compositional changes at the same evolutionary transition, which were in part different in the two classes of warm-blooded vertebrates. Correspondence to: G. Bernardi

8.
Sau K, Gupta SK, Sau S, Mandal SC, Ghosh TC. Bio Systems 2006, 85(2):107-113
Synonymous codon and amino acid usage biases have been investigated in 903 Mimivirus protein-coding genes in order to understand the architecture and evolution of the Mimivirus genome. As expected for an AT-rich genome, third codon positions of the synonymous codons of Mimivirus carry mostly A or T bases. It was found that codon usage bias in Mimivirus genes is dictated both by mutational pressure and translational selection. The evidence shows that four factors, namely mean molecular weight (MMW), hydropathy, aromaticity and cysteine content, are mostly responsible for the variation of amino acid usage in Mimivirus proteins. Based on our observations, we suggest that genes involved in translation, DNA repair, protein folding, etc., were laterally transferred to Mimivirus long ago from living organisms and that with time these genes acquired the codon usage pattern of other Mimivirus genes under selection pressure.

9.
Summary. Previous studies showed that the cellular amino acid composition, obtained by amino acid analysis of whole cells, differs among organisms such as eubacteria, protozoa, fungi and mammalian cells. These results suggest that the difference in the cellular amino acid composition reflects biological changes as the result of evolution. However, the basic pattern of cellular amino acid composition was relatively constant in all organisms examined. In the present study, we examined archaeobacteria, because they are considered important in understanding the relationship between biological evolution and cellular amino acid composition. The cellular amino acid compositions of Archaeoglobus fulgidus, Pyrococcus horikoshii, Methanobacterium thermoautotrophicum and Methanococcus jannaschii differed slightly from each other, but were similar to those determined from codon usage data, based on the complete genomes. Thus, the cellular amino acid composition reflects biological evolution. We suggest that primitive forms of life appearing on earth at the end of prebiotic evolution had a similar cellular amino acid composition. Received November 28, 2000; accepted January 30, 2001

10.
Summary. The compositional distribution of coding sequences from five vertebrates (Xenopus, chicken, mouse, rat, and human) is shifted toward higher GC values compared to that of the DNA molecules (in the 35–85-kb size range) isolated from the corresponding genomes. This shift is due to the lower GC levels of intergenic sequences compared to coding sequences. In the cold-blooded vertebrate, the two distributions are similar in that GC-poor genes and GC-poor DNA molecules are largely predominant. In contrast, in the warm-blooded vertebrates, GC-rich genes are largely predominant over GC-poor genes, whereas GC-poor DNA molecules are largely predominant over GC-rich DNA molecules. As a consequence, the genomes of warm-blooded vertebrates show a compositional gradient of gene concentration. The compositional distributions of coding sequences (as well as of DNA molecules) showed remarkable differences between chicken and mammals, and between mouse (or rat) and human. Differences were also detected in the compositional distribution of housekeeping and tissue-specific genes, the former being more abundant among GC-rich genes.

11.
We studied the correlations between amino acid composition and mononucleotide and dinucleotide frequencies in 115 bacterial genomes of varying G+C content. Observed amino acid frequencies were compared with those expected from the actual mononucleotide and dinucleotide frequencies. Both mononucleotide and dinucleotide frequencies correlate well with the amino acid frequency, with dinucleotide frequencies doing so better. Despite the strong correlations, some of the observed amino acid frequencies, in particular for Arg, Val, Asp, Glu, Ser, and Cys, were consistently different from predicted values in all genomes. We suggest that this variation from predicted values is a consequence of selection pressure at the level of amino acids, while the close correspondence to the predictions in residues such as Thr, Phe, Lys, and Asn arises only from mutation and selection pressure at the level of the nucleic acid sequences.
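
The abstract above compares observed amino acid frequencies with those expected from nucleotide frequencies but does not spell out the calculation. One plausible reading, sketched here for the mononucleotide case only, treats each codon's probability as the product of its three base frequencies and sums over the sense codons of the standard genetic code. The function name `expected_aa_freqs` and the independence assumption are illustrative, not the authors' stated method.

```python
from itertools import product

# Standard genetic code in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def expected_aa_freqs(base_freqs):
    """Expected amino acid frequencies from mononucleotide frequencies.

    Assumption: bases are drawn independently at every codon position;
    stop codons are excluded and the result is renormalised to sum to 1.
    """
    totals = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # skip stop codons
        p = base_freqs[codon[0]] * base_freqs[codon[1]] * base_freqs[codon[2]]
        totals[aa] = totals.get(aa, 0.0) + p
    norm = sum(totals.values())
    return {aa: p / norm for aa, p in totals.items()}


if __name__ == "__main__":
    gc_rich = {"G": 0.4, "C": 0.4, "A": 0.1, "T": 0.1}
    freqs = expected_aa_freqs(gc_rich)
    for aa in sorted(freqs, key=freqs.get, reverse=True)[:5]:
        print(aa, round(freqs[aa], 3))
```

Under this sketch, a G+C-rich composition raises the expected share of residues such as Ala, Gly, Pro and Arg and lowers that of Lys, Asn, Phe and Ile; deviations of observed frequencies from these expectations are what the abstract attributes to selection at the amino acid level.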

12.
Summary. One hundred twelve human DNA sequences were analyzed with respect to dinucleotide frequency and amino acid composition. The variation in guanine and cytosine (G+C) content revealed: (1) at 2–3 and 3–1 doublet positions CG discrimination is attenuated at high G+C, but TA disfavor is enhanced, and (2) several amino acids are subject to G+C change. These findings have been reported in part for collections of sequences from various species. The present study confirms that the G+C effects do exist in a single organism, the human. Aspects of the argument that connects G+C with protein thermal stability are also discussed.

13.
Summary. High performance liquid chromatography was used to analyze the amino acid composition of cells. A total of 17 amino acids was analyzed. This method was used to compare the amino acid compositions of the following combinations: primary culture and established cells, normal and transformed cells, mammalian and bacterial cells, and Escherichia coli and Staphylococcus aureus. The amino acid compositions of mammalian cells were similar, but the amino acid compositions of Escherichia coli and Staphylococcus aureus differed not only from mammalian cells, but also from each other. It was concluded that amino acid composition is almost independent of cell establishment and cell transformation, and that the amino acid compositions of mammalian and bacterial cells differ. Thus, it is likely that changes in amino acid composition due to cell transformation or species differences between mammalian cells are negligible compared with the differences between mammalian and bacterial cells, which are more distantly related.

14.
Singer GA, Hickey DA. Gene 2003, 317(1-2):39-47
A number of recent studies have shown that thermophilic prokaryotes have distinguishable patterns of both synonymous codon usage and amino acid composition, indicating the action of natural selection related to thermophily. On the other hand, several other studies of whole genomes have illustrated that nucleotide bias can have dramatic effects on synonymous codon usage and also on the amino acid composition of the encoded proteins. This raises the possibility that the thermophile-specific patterns observed at both the codon and protein levels are merely reflections of a single underlying effect at the level of nucleotide composition. Moreover, such an effect at the nucleotide level might be due entirely to mutational bias. In this study, we have compared the genomes of thermophiles and mesophiles at three levels: nucleotide content, codon usage and amino acid composition. Our results indicate that the genomes of thermophiles are distinguishable from mesophiles at all three levels and that the codon and amino acid frequency differences cannot be explained simply by the patterns of nucleotide composition. At the nucleotide level, we see a consistent tendency for the frequency of adenine to increase at all codon positions within the thermophiles. Thermophiles are also distinguished by their pattern of synonymous codon usage for several amino acids, particularly arginine and isoleucine. At the protein level, the most dramatic effect is a two-fold decrease in the frequency of glutamine residues among thermophiles. These results indicate that adaptation to growth at high temperature requires a coordinated set of evolutionary changes affecting (i) mRNA thermostability, (ii) stability of codon-anticodon interactions and (iii) increased thermostability of the protein products. We conclude that elevated growth temperature imposes selective constraints at all three molecular levels: nucleotide content, codon usage and amino acid composition. In addition to these multiple selective effects, however, the genomes of both thermophiles and mesophiles are often subject to superimposed large changes in composition due to mutational bias.

15.
Abstract: Intact neurofilaments were isolated from bovine spinal cord white matter, washed by sedimentation in 0.1 M NaCl, and extracted with 8 M urea. Solubilized neurofilament triplet proteins of molecular weights approximately 68,000 (P68), 150,000 (P150), and 200,000 (P200) were purified by preparative electrophoresis, using an LKB 7900 Uniphor apparatus. The method provides for an enhanced yield of purified protein and has markedly reduced admixture of electrophoresed protein with acrylamide and associated protein contaminants. Amino acid compositions of the purified neurofilament triplet proteins are reported and compared.

16.
17.
Abstract

It is known that the recognition of an AUG triplet by eukaryotic ribosomes as a translation start site strongly depends on its nucleotide context. However, the relative significance of different context positions is not fully clear. In particular, this concerns the role of the 3′-end part of the context, located at the beginning of the protein-coding sequence. The significant bias observed in nucleotide frequencies in positions +4, +5, +6 (corresponding to the second codon of the CDS) could have several causes, and its contribution to start codon recognition and initiation of translation is under discussion. In this study, we conducted a comparative computational analysis of human mRNA samples containing different nucleotides (adenine, guanine or pyrimidine) in the essential context position −3. It was found that the presence of G in position +4 could be important for the context variant GnnAUG but not for AnnAUG. Interestingly, the second position of proteins encoded by mRNAs with the AnnAUG context variant was specifically and significantly enriched with serine, whereas the presence of the GnnAUG context also correlated with a higher occurrence of alanine and glycine. It is likely that the efficiency of the translation initiation process can depend on the interplay between the 5′-context part, the 3′-context part and the type of amino acid in the second position of the encoded protein.
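
The positions named in this abstract follow the usual convention in which the A of the AUG start codon is +1, so position −3 lies three nucleotides upstream and positions +4 to +6 form the second codon. The sketch below shows how those positions could be read from an mRNA given the start codon's index; the function name `start_context` and the returned keys are hypothetical and not taken from the paper.

```python
def start_context(mrna, start):
    """Extract start-codon context positions from an mRNA.

    `start` is the 0-based index of the A in the AUG start codon
    (position +1 in the usual numbering). DNA input is accepted and
    converted to RNA. Assumption: at least three nucleotides of 5' UTR
    and one full codon after AUG are present.
    """
    seq = mrna.upper().replace("T", "U")
    if seq[start:start + 3] != "AUG":
        raise ValueError("no AUG at the given start index")
    if start < 3 or start + 6 > len(seq):
        raise ValueError("not enough flanking sequence")
    return {
        "minus3": seq[start - 3],                 # position -3
        "plus4": seq[start + 3],                  # position +4
        "second_codon": seq[start + 3:start + 6]  # positions +4..+6
    }


if __name__ == "__main__":
    print(start_context("GCCACCAUGGCU", 6))
```

Grouping a set of mRNAs by the `minus3` value (A, G or pyrimidine) and tabulating `plus4` or the translated `second_codon` within each group would reproduce the general shape of the comparison described above.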

18.
Bacteria and archaea have evolved the ability to fix atmospheric dinitrogen in the form of ammonia, catalyzed by the nitrogenase enzyme complex, which is encoded by three structural genes, nifK, nifD and nifH. nifK and nifD encode the beta and alpha subunits, respectively, of component 1, while nifH encodes component 2 of nitrogenase. Phylogeny based on nifDHK has indicated that Cyanobacteria are closer to Proteobacteria alpha and gamma, but this is not supported by the tree based on 16S rRNA. The evolutionary ancestor for the different trees was also different. The GC1 and GC2% analysis showed more consistency than GC3%, which appeared to be low for Firmicutes, Cyanobacteria and Euryarchaeota while highest in Proteobacteria beta, and clearly showed the proportional effect on the codon usage, with a few exceptions. A few genes from Firmicutes, Euryarchaeota, Proteobacteria alpha and delta were found to be under mutational pressure. These nif genes with low and high GC3% from different classes of organisms showed a similar expected number of codons. Distribution of the genes and codons, based on codon usage, demonstrated opposite patterns for different orientations of the mirror plane when compared with each other. Overall, our results provide a comprehensive analysis of the evolutionary relationship of the three structural nif genes, nifK, nifD and nifH, in the context of codon usage bias, GC content relationship and amino acid composition of the encoded proteins, and an exploration of a crucial statistical method for the analysis of positive data with non-constant variance to identify the shape factors of the codon adaptation index.

19.
Summary. All the codons of the genetic code can be arranged into a closed one-step mutation ring, containing three periods of the same sequence of mutations (2,3,3,3,1,3,3,3,1,3,3,3,1,3,3,3,2,3,3,3). The codons of Gly play the role of the connecting element between the end of the third and the beginning of the first period of the genetic code. The reactivity of amino acids, expressed by the reaction rates of the aminolysis reaction of N-hydroxysuccinimide esters of protected amino acids with p-anisidine, changes periodically with respect to the mutation periods of the genetic code. The Chou-Fasman Pα and Pβ conformational parameters of amino acids, and also the compositional frequencies of amino acids in proteins, demonstrate a pseudosymmetry pattern with respect to the center of the one-step mutation ring, which is situated between the Thr ACY and ACR codons.

20.
Summary. A well-preserved nutritional status is beneficial in chronically uremic patients for slowing the pace of deterioration of renal function and delaying the need for dialysis therapy. The purpose of this study was to assess the nutritional profile of 10 patients in a steady state of advanced CRF, and of 15 patients with terminal renal failure immediately prior to their first hemodialysis session (J0) and 7, 14, 45, and 60 days after the start of dialysis. Patients were 18 to 65 years old with total plasma proteins 60 g/l. Plasma concentrations of amino acids, nutrition proteins, and apolipoproteins A1 and B were evaluated. Non inflammatory reaction was evaluated by determination of alpha-1-acid glycoprotein and C reactive protein. The data (mean ± 1 SD) were compared with mean values of 15 healthy individuals.
