Similar articles
 20 similar articles found (search time: 468 ms)
1.
The mammalian genome is not a random sequence but shows a specific, evolutionarily conserved structure that becomes manifest in its isochore pattern. Isochores, i.e. stretches of DNA with a distinct sequence composition and thus a specific GC content, cause the chromosomal banding pattern. This fundamental level of genome organization is related to several functional features such as the replication timing of a DNA sequence. GC richness of genomic regions generally corresponds to an early replication time during S phase. Recently, we demonstrated this interdependency on a molecular level for an abrupt transition from a GC-poor isochore to a GC-rich one in the NF1 gene region; this isochore boundary also separates late from early replicating chromatin. Now, we have analyzed another genomic region containing four isochores separated by three sharp isochore transitions. Again, the GC-rich isochores were found to replicate early and the GC-poor isochores late in S phase; one of the replication time zones was found to consist of a single replicon. At the boundaries between isochores, none of which shows special sequence elements, the replication machinery stopped for several hours. Thus, our results emphasize the importance of isochores as functional genomic units, and of isochore transitions as genomic landmarks with a key function for chromosome organization and basic biological properties.

2.
3.
An isochore map of the human genome based on the Z curve method (cited by 4: 0 self-citations, 4 by others)
Zhang CT, Zhang R. Gene, 2003, 317(1-2): 127-135
The distribution of the G+C content in the human genome has been studied by using a windowless technique derived from the Z curve method. The most important findings presented in this paper are twofold. First, abrupt variations of the G+C content along human chromosome sequences are the main variation patterns of G+C content. It is found that at some sites the G+C content changes abruptly from a G+C-rich region to a G+C-poor region, or vice versa. Second, it is shown that long domains with relatively homogeneous G+C content along each chromosome do exist. These domains are thought to be isochores, which usually have sharp boundaries. Consequently, 56 isochores longer than 3 Mb have been identified in chromosomes 1-22, X and Y. Boundaries, size and G+C content of each isochore identified are listed in detail. As an example to demonstrate the power of the method, the boundary between the class III and class II isochores of the MHC sequence has been determined and found to be at position 2,477,936, which is in good agreement with the experimental evidence. A homogeneity index is introduced to measure the homogeneity of G+C content in isochores. We emphasize that the homogeneity of G+C content is relative. Isochores in which the G+C content stays absolutely constant do not exist. Isochore structures appear to be a basic organization of the human genome. Because of their relevance to many important biological functions, the clarification of isochore structures will provide much insight into the human genome.
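The windowless idea in the abstract above can be illustrated with a minimal Python sketch. It assumes the commonly used definition of the Z curve's third component, the cumulative count of (A+T) minus (G+C); the function names `z_component` and `gc_fraction_between` are hypothetical, and this is not the authors' implementation (which, as far as the abstract states, works with a derived curve whose details are not given here).

```python
def z_component(seq):
    """Cumulative (A+T) - (G+C) count along seq.

    z[0] is 0 and z[n] summarizes the first n bases.
    Assumption: the sequence contains only A, C, G and T.
    """
    z = [0]
    for base in seq.upper():
        if base in "AT":
            z.append(z[-1] + 1)
        elif base in "GC":
            z.append(z[-1] - 1)
        else:
            raise ValueError(f"unsupported base: {base!r}")
    return z


def gc_fraction_between(z, start, end):
    """G+C fraction of seq[start:end], read off the curve without a sliding window.

    Over a segment of length dn, dz = (A+T) - (G+C) and dn = (A+T) + (G+C),
    so the G+C fraction is (1 - dz/dn) / 2.
    """
    dn = end - start
    if dn <= 0:
        raise ValueError("end must be greater than start")
    dz = z[end] - z[start]
    return (1 - dz / dn) / 2


z = z_component("ATATGCGC")
print(gc_fraction_between(z, 0, 4), gc_fraction_between(z, 4, 8))
```

In such a curve, a sharp change of slope marks a candidate boundary between G+C-poor and G+C-rich domains, because the slope depends only on the local composition.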

4.
Characteristics of human and mouse orthologous gene sequences which have large G+C content variations were investigated in this study. The orthologous gene pairs were classified into two groups according to the deviation between human and mouse G+C content at the third codon position (GC3) and were subsequently analyzed. In one group, mouse genes had higher GC3 than the corresponding human genes and in the other group, human genes had higher GC3 than mouse genes. Furthermore, the orthologous pairs were separated based on the deviation between human or mouse GC3 and the G+C content at the third codon position of identical codons (IC3), to examine the effect of increased or decreased G+C content in human or mouse sequences. The nucleotide substitution patterns between human and mouse sequences in the two groups were remarkably distinct, and consistent with the state of G+C-rich or G+C-poor sequences. The effect of increase or decrease of G+C content in human or mouse sequences was not clear in the nucleotide substitution patterns. The chromosomal locations of human and mouse orthologous gene pairs were different between the two groups. The genes located on an identical syntenic segment showed the trend of having similar G+C content. Moreover, the same gene order of some genes on different chromosomes of both species demonstrated the gene rearrangements between human and mouse. Our study indicated that the chromosomal locations and rearrangements are associated with the GC3 variation between human and mouse sequences. Key Words: Human mouse orthologs, G+C content variation, nucleotide substitution, gene location, gene rearrangement.

5.
The relative contribution of mutation and selection to the G+C content of DNA was analyzed in bacterial species having widely different G+C contents. The analysis used two methods that were developed previously. The first method was to plot the average G+C content of a set of nucleotides against the G+C content of the third codon position for each gene. This method was used to present the G+C distribution of the third codon position and to assess the relative neutrality of a set of nucleotides to that of the G+C content of the third codon position. The second method was to plot the intrastrand bias of the third codon position from Parity Rule 2 (PR2), where A=T and G=C. It was found that whereas intragenomic distributions of the DNA G+C content of these bacteria are narrow in the majority of species, in some species the G+C content of the minor class of genes distributes over wider ranges than the major class of genes. On the other hand, ubiquitous PR2 biases are amino acid specific and independent of the G+C content of DNA, so that when averaged over the amino acids, the biases are small and not correlated with the DNA G+C content. Therefore, translation-coupled PR2 biases are unlikely to explain the wide range of G+C contents among different species. Considering all data available, it was concluded that the amino acid-specific PR2 bias has only a minor effect, if any, on the average G+C content. In addition, PR2 bias patterns of different species show phylogenetic relationships, and the pattern can be used as a taxal fingerprint. Received: 5 November 1998 / Accepted: 1 March 1999

6.
Sueoka N, Kawanishi Y. Gene, 2000, 261(1): 53-62
The human genome, as in other eukaryotes, has a wide heterogeneity in the DNA base composition. The evolutionary basis for this heterogeneity has been unknown. A previous study of the human genome (846 genes analyzed) has shown that, in the major range of the G+C content in the third codon position (0.25-0.75), biases from the Parity Rule 2 (PR2) among the synonymous codons of the four-codon amino acids are similar except in the highest G+C range (Sueoka, N., 1999. Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position. Gene 238, 53-58.). PR2 is an intra-strand rule where A=T and G=C are expected when there are no biases between the two complementary strands of DNA in mutation and selection rates (substitution rates). In this study, 14,026 human genes were analyzed. In addition, the third codon positions of two-codon amino acids were analyzed. New results show the following: (a) The G+C contents of the third codon position of human genes are scattered in the G+C range of 0.22-0.96 in the third codon position. (b) The PR2 biases are similar in the range of 0.25-0.75, whereas, in the high G+C range (0.75-0.96; 13% of the genes), the PR2-bias fingerprints are different from those of the major range. (c) Unlike the PR2 biases, the G+C contents of the third codon position for both four-codon and two-codon amino acids are all correlated almost perfectly with the G+C content of the third codon position over the total G+C ranges. These results support the notion that the directional mutation pressure, rather than the directional selection pressure, is mainly responsible for the heterogeneity of the G+C content of the third codon position.
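The quantities named in the abstract above (G+C content at the third codon position, and deviation from PR2's A=T and G=C expectation) can be sketched in Python. This is only an illustration under stated assumptions: it pools all codons of one in-frame coding sequence rather than separating four-codon and two-codon amino acids as the study does, and the function name `third_position_stats` and its dictionary keys are hypothetical.

```python
from collections import Counter


def third_position_stats(cds):
    """Summarize base composition at third codon positions of an in-frame CDS.

    Returns GC3 plus the two ratios A3/(A3+T3) and G3/(G3+C3); under PR2
    (A=T and G=C within one strand) both ratios are expected to be 0.5.
    Assumption: all codons are pooled, with no split by amino acid class.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i] for i in range(2, usable, 3))
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c
    return {
        "gc3": (g + c) / total if total else None,
        "a3_over_at3": a / (a + t) if (a + t) else None,
        "g3_over_gc3": g / (g + c) if (g + c) else None,
    }


print(third_position_stats("GCAGCTGCGGCC"))
```

Plotting `g3_over_gc3` against `a3_over_at3` for many genes gives a picture in which the point (0.5, 0.5) corresponds to no deviation from PR2.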

7.
There is a long-standing debate in molecular evolution concerning the putative importance of GC content in adapting the thermal stabilities of DNA and RNA. Most studies of this relationship have examined broad-scale compositional patterns, for example, total GC percentages in genomes and occurrence of GC-rich isochores. Few studies have systematically examined the GC contents of individual orthologous genes from differently thermally adapted species. When this has been done, the emphasis has been on comparing large numbers of genes in only a few species. We have approached the GC-adaptation temperature hypothesis in a different manner by examining patterns of base composition of genes encoding lactate dehydrogenase-A (ldh-a) and alpha-actin (alpha-actin) from 51 species of vertebrates whose adaptation temperatures ranged from -1.86 degrees C (Antarctic fishes) to approximately 45 degrees C (desert reptile). No significant positive correlation was found between any index of GC content (GC content of the entire sequence, GC content of the third codon position [GC(3)], and GC content at fourfold degenerate sites [GC(4)]) and any index of adaptation temperature (maximal, mean, or minimal body temperature). For alpha-actin, slopes of regression lines for all comparisons did not differ significantly from zero. For ldh-a, negative correlations between adaptation temperature and total GC content, GC(3), and GC(4) were observed but were shown to be due entirely to phylogenetic influences (as revealed by independent contrast analyses). This comparison of GC content across a wide range of ectothermic ("cold-blooded") and endothermic ("warm-blooded") vertebrates revealed that frogs of the genus Xenopus, which have commonly been used as a representative cold-blooded species, in fact are outliers among ectotherms for the alpha-actin analyses, raising concern about the appropriateness of choosing these amphibians as representative of ectothermic vertebrates in general. 
Our study indicates that, whereas GC contents of isochores may show variation among different classes of vertebrates, there is no consistent relationship between adaptation temperature and the percentage of thermal stability-enhancing G + C base pairs in protein-coding genes.

8.
I show that the recognition sequences of Type II restriction systems are correlated with the G + C content of the host bacterial DNA. Almost all restriction systems with G + C rich tetranucleotide recognition sequences are found in species with A + T rich genomes, whereas G + C rich hexanucleotide and octanucleotide recognition sequences are found almost exclusively in species with G + C rich genomes. Most hexanucleotide recognition sequences found in species with A + T rich genomes are A + T rich. This distribution eliminates a substantial proportion of the potential variance in the frequency of restriction recognition sequences in the host genomes. As a consequence, almost all restriction recognition sequences, including those eight base pairs in length (Not I and Sfi I), are predicted to occur with a frequency ranging from once every 300 to once every 5,000 base pairs in the host genome. Since the G + C content of bacteriophage DNA and of the host genome are also correlated, the data presented is evidence that most Type II "restriction systems" are indeed involved in phage restriction.

9.
Summary: The G+C content of DNA varies widely in different organisms, especially microorganisms. This variation is accompanied by changes in the nucleotide composition of silent positions in codons. (Silent positions are defined and explained in the text.) These changes are mostly neutral or near neutral, and appear to result from mutation pressure in the direction of increasing either A+T (AT pressure) or G+C (GC pressure) content. Variations in G+C content are also accompanied by substitutions at replacement positions in codons. These substitutions produce changes in the amino acid content of homologous proteins. The examples studied were genes for 13 mitochondrial proteins in five species, and A and B genes for bacterial tryptophan synthase in four species. In microorganisms, varying AT and GC mutational pressures, presumably resulting from shifts in the DNA polymerase system, exert strong effects on molecular evolution by changing the G+C content of DNA. These effects may be greater than those of random drift. The effects of GC pressure on silent substitutions in the systems examined are several times as great as the effects on replacement substitutions. GC pressure is exerted on noncoding as well as coding regions in mitochondrial DNA. This is shown by the close correlation (correlation coefficient, 0.99) of the G+C content of the noncoding D loop of mitochondria with the G+C content of silent positions in the corresponding mitochondrial genes.

10.
The mammalian genome is organized as a mosaic of isochores, stretches of DNA with a distinct sequence composition. Isochores form the basis of the chromosomal banding pattern, which is tightly correlated with a number of structural and functional features. We have recently demonstrated that the transition from a GC-poor isochore to a GC-rich one in the NF1 gene region occurs within 5 kb and demarcates genomic regions with high and low recombination frequency. We now report that the same transition zone separates early replicating from late replicating chromatin on the molecular level. At the isochore transition the replication fork is stalled in mid-S phase and can be visualized by fiber-FISH techniques as a Y-shaped structure. The switch in GC content and in replication timing is conserved between human and mouse, emphasizing the importance of the transition zones as landmarks of chromosome organization and function.

11.
BACKGROUND: Nucleotide substitution rates and G + C content vary considerably among mammalian genes. It has been proposed that the mammalian genome comprises a mosaic of regions - termed isochores - with differing G + C content. The regional variation in gene G + C content might therefore be a reflection of the isochore structure of chromosomes, but the factors influencing the variation of nucleotide substitution rate are still open to question. RESULTS: To examine whether nucleotide substitution rates and gene G + C content are influenced by the chromosomal location of genes, we compared human and murid (mouse or rat) orthologues known to belong to one of the chromosomal (autosomal) segments conserved between these species. Multiple members of gene families were excluded from the dataset. Sets of neighbouring genes were defined as those lying within 1 centiMorgan (cM) of each other on the mouse genetic map. For both synonymous substitution rates and G + C content at silent sites, neighbouring genes were found to be significantly more similar to each other than sets of genes randomly drawn from the dataset. Moreover, we demonstrated that the regional similarities in G + C content (isochores) and synonymous substitution rate were independent of each other. CONCLUSIONS: Our results provide the first substantial statistical evidence for the existence of a regional variation in the synonymous substitution rate within the mammalian genome, indicating that different chromosomal regions evolve at different rates. This regional phenomenon which shapes gene evolution could reflect the existence of 'evolutionary rate units' along the chromosome.

12.
CpG and TpA dinucleotides are underrepresented in the human genome. The CpG deficiency is due to the high mutation rate from C to T in methylated CpG's. The TpA suppression was thought to reflect a counterselection against TpA's destabilizing effect in RNA. Unexpectedly, the TpA and CpG deficiencies vary according to the G+C contents of sequences. It has been proposed that the variation in CpG suppression was correlated with a particular chromatin organization in G+C-rich isochores. Here, we present an improved model of dinucleotide evolution accounting for the overlap between successive dinucleotides. We show that an increased mutation rate from CpG to TpG or CpA induces both an apparent TpA deficiency and a correlation between CpG and TpA deficiencies and G+C content. Moreover, this model shows that the ratio of observed over expected CpG frequency underestimates the real CpG deficiency in G+C-rich sequences. The predictions of our model fit well with observed frequencies in human genomic data. This study suggests that previously published selectionist interpretations of patterns of dinucleotide frequencies should be taken with caution. Moreover, we propose new criteria to identify unmethylated CpG islands taking into account this bias in the measure of CpG depletion.
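For orientation, the "ratio of observed over expected CpG frequency" that the abstract above criticizes is usually computed from single-base counts. The Python sketch below shows that conventional estimate only, assuming the common form (number of CpG × sequence length) / (number of C × number of G); it is not the improved dinucleotide model proposed by the authors, and the function name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq):
    """Conventional observed/expected CpG ratio for one sequence.

    Assumed form: (count of CG dinucleotides * length) / (count of C * count of G).
    Returns None when the sequence has no C or no G, where the ratio is undefined.
    """
    seq = seq.upper()
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return None
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    cpg = seq.count("CG")
    return cpg * len(seq) / (c * g)


print(cpg_obs_exp("CCGG"), cpg_obs_exp("AATT"))
```

A value near 1 suggests no CpG depletion and values well below 1 suggest depletion; according to the abstract, this simple ratio understates the real deficiency in G+C-rich sequences.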

13.
Mouse satellite DNA sequences isolated by centrifugation in Cs2SO4-Ag+ gradients are analyzed for buoyant density by CsCl density gradients and for their content of fast reassociating sequences by denaturation and partial reassociation. Our data suggest that in Cs2SO4 gradients, silver ions separate a satellite band which contains both fast reassociating, G+C rich sequences and slow reassociating, A+T rich DNA sequences.

14.
To date, the sequences of 45 Bradyrhizobium japonicum genes are known. This provides sufficient information to determine their codon usage and G+C content. Surprisingly, B. japonicum nodulation and NifA-regulated genes were found to have a less biased codon usage and a lower G+C content than genes not belonging to these two groups. Thus, the coding regions of nodulation genes and NifA-regulated genes could hardly be identified in codon preference plots whereas this was not difficult with other genes. The codon frequency table of the highly biased genes was used in a codon preference plot to analyze the RSRj9 sequence which is an insertion sequence (IS)-like element. The plot helped identify a new open reading frame (ORF355) that escaped previous detection because of two sequencing errors. These were now corrected. The deduced gene product of ORF355 in RSRj9 showed extensive similarity to a putative protein encoded by an ORF in the T-DNA of Agrobacterium rhizogenes. The DNA sequences bordering both ORFs showed inverted repeats and potential target site duplications which supported the assumption that they were IS-like elements.

15.
The human genome is described in the literature as being composed of isochores, i.e., long (hundreds of kilobases) segments with a homogeneous (G + C) content. We calculated the (G + C) content variations along the DNA molecules of human chromosomes 21 and 22 and found the variations to be higher everywhere compared to the randomized sequences. Hence the (G + C) content is certainly not homogeneous on the isochore scale in the two human chromosomes. In addition, we found no significant difference between the two human molecules and the genome of Escherichia coli regarding the (G + C) content variations. Hence either no isochores are present in the DNA molecules of human chromosomes 21 and 22, or isochores are also present in the genome of E. coli. In any case, the present communication demonstrates that isochores should be defined in unambiguous molecular terms if they are to be used for an up-to-date genome structure characterization.

16.
It has been suggested that isochores are maintained by mutation biases, and that this leads to variation in the rate of mutation across the genome. A model of DNA replication is presented in which the probabilities of misincorporation and proofreading are affected by the composition and concentration of the free nucleotide pools. The relationship between sequence G+C content and the mutation rate is investigated. It is found that there is very little variation in the mutation rate between sequences of different G+C contents if the total concentration of the free nucleotides remains constant. However, variation in the mutation rate can be arbitrarily large if some mismatches are proofread and the total concentration of free nucleotides varies. Hence the model suggests that the maintenance of isochores by the replication of DNA in free nucleotide pools of biased composition does not lead per se to mutation rate variance. However, it is possible that changes in composition could be accompanied by changes in concentration, thus generating mutation rate variance. Furthermore, there is the possibility that germ-line selection could lead to alterations in the overall free nucleotide concentration through the cell cycle. These findings are discussed with reference to the variance in mammalian silent substitution rates.

17.
Pink CJ, Hurst LD. PLoS ONE, 2011, 6(9): e24480
In mammals, sequences that are either late replicating or highly recombining have high rates of evolution at putatively neutral sites. As early replicating domains and highly recombining domains both tend to be GC rich, we a priori expect these two variables to covary. If so, the relative contribution of either of these variables to the local neutral substitution rate might have been wrongly estimated owing to covariance with the other. Against our expectations, we find that sex-averaged recombination rates show little or no correlation with replication timing, suggesting that they are independent determinants of substitution rates. However, this result masks significant sex-specific complexity: late replicating domains tend to have high recombination rates in females but low recombination rates in males. That these trends are antagonistic explains why sex-averaged recombination is not correlated with replication timing. This unexpected result has several important implications. First, although both male and female recombination rates covary significantly with intronic substitution rates, the magnitude of this correlation is moderately underestimated for male recombination and slightly overestimated for female recombination, owing to covariance with replication timing. Second, the result could explain why male recombination is strongly correlated with GC content but female recombination is not. If to explain the correlation between GC content and replication timing we suppose that late replication forces reduced GC content, then GC promotion by biased gene conversion during female recombination is partly countered by the antagonistic effect of later replicating sequence tending to increase AT content. Indeed, the strength of the correlation between female recombination rate and local GC content is more than doubled by control for replication timing. 
Our results underpin the need to consider sex-specific recombination rates and potential covariates in analysis of GC content and rates of evolution.

18.
A positive correlation was found between the G + C content at the third codon position in vertebrate genes and the G + C content of the genome portion surrounding each gene. Exons of genes with a high G + C% at the third codon position are surrounded by G + C-rich introns and G + C-rich flanking sequences, and those with a low G + C% at this position by A + T-rich introns and flanking sequences. Analysis of G + C content distribution along DNA sequences using a DNA Sequence Data Bank supported the view that the vertebrate genome is a mosaic of regions with clear differences in their G + C content. The biological significance of the variation in G + C content throughout the vertebrate genome is discussed in connection with chromosomal banding.

19.
The mammalian chromosomes present specific sites of gaps or breaks, the common fragile sites (CFSs), when the cells are exposed to DNA replication stress or to some DNA binding compounds. CFSs span hundreds or thousands of kilobases. The analysis of these sequences has not definitively clarified the causes of their fragility. There is considerable evidence that CFSs are regions of late or slowed replication in the presence of sequence elements that have the propensity to form secondary structures, and that the cytogenetic expression of CFSs may be due to unreplicated DNA. In order to analyse the relationship between DNA replication time and fragility, in this work we have investigated the timing of replication of sequences mapping within two CFSs (FRA1H and FRA2G), of syntenic non-fragile sequences and of early and late replicating control sequences by using fluorescent in situ hybridization on interphase nuclei, conventional fluorescence microscopy and confocal microscopy. Our results indicate that the fragile sequences are slow replicating and that they enter G2 phase unreplicated with very high frequency. Thus these regions could sometimes reach mitosis unreplicated or undercondensed and be expressed as chromosome gaps/breakages.

20.
Past analyses of the genome of the yeast Saccharomyces cerevisiae have revealed substantial regional variation in G+C content. Important questions remain, though, as to the origin, nature, significance, and generality of this variation. We conducted an extensive analysis of the yeast genome to try to answer these questions. Our results indicate that open reading frames (ORFs) with similar G+C contents at silent codon positions are significantly clustered on chromosomes. This clustering can be explained by very short range correlations of silent-site G+C contents at neighboring ORFs. ORFs of high silent-site G+C content are disproportionately concentrated on shorter chromosomes, which causes a negative relationship between chromosome length and G+C content. Contrary to previous reports, there is no correlation between gene density and silent-site G+C content in yeast. Chromosome III is atypical in many regards, and possible reasons for this are discussed.
