Similar articles
20 similar articles found.
1.
A positive correlation was found between the G + C content at the third codon position in vertebrate genes and the G + C content of the genome portion surrounding each gene. Exons of genes with a high G + C% at the third codon position are surrounded by G + C-rich introns and G + C-rich flanking sequences, and those with a low G + C% at that position by A + T-rich introns and flanking sequences. Analysis of G + C content distribution along DNA sequences using a DNA Sequence Data Bank supported the view that the vertebrate genome is a mosaic of regions with clear differences in their G + C content. The biological significance of the variation in G + C content throughout the vertebrate genome is discussed in connection with chromosomal banding.
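The quantity compared in this abstract, the G + C fraction at third codon positions versus the G + C fraction of the surrounding sequence, can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names `gc3` and `gc_fraction` are hypothetical, and the coding sequence is assumed to be in frame and to start at its first codon.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G or C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("no unambiguous bases in sequence")
    return sum(b in "GC" for b in bases) / len(bases)


def gc3(cds: str) -> float:
    """G + C fraction at third codon positions.

    Assumption: cds is an in-frame coding sequence; a trailing
    partial codon, if any, is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # drop a trailing partial codon
    thirds = cds[2:usable:3]              # bases at positions 3, 6, 9, ...
    if not thirds:
        raise ValueError("sequence shorter than one codon")
    return sum(b in "GC" for b in thirds) / len(thirds)


if __name__ == "__main__":
    exon = "ATGGCCAAA"                    # codons ATG, GCC, AAA
    flank = "ATATGCAT"                    # made-up flanking sequence
    print(gc3(exon), gc_fraction(flank))
```

Computing both values for many genes and their flanking regions would give the pairs of numbers on which a correlation of the kind reported here is based.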

2.
The mitochondrial genome of wild-type yeast cells. IV. Genes and spacers   Cited by: 12 (self-citations: 0, citations by others: 12)
The organization of the mitochondrial genome of wild-type Saccharomyces cerevisiae cells has been investigated further, by degrading mitochondrial DNA with micrococcal nuclease. Under the conditions used, this enzyme very strongly degrades the A + T-rich stretches (spacers) whereas it only inflicts a limited number of breaks into the G + C-rich stretches (genes). The macromolecular fragments derived from the “genes” have been separated from the oligonucleotides originating from the “spacers” by gel filtration, and both sorts of products have been investigated. It has been shown (a) that the spacers are very homogeneous in base composition and have a G + C content lower than 5% (mitochondrial DNA has a G + C content of 18%); (b) that the genes are very heterogeneous in base composition, the G + C content ranging from about 25% to 50%, when the average size of the fragments is 1.2 × 10^5; smaller fragments, molecular weight 4 × 10^4, having a G + C level as high as 65%, have been isolated in a yield of 10%; the average G + C content of genes is about 32%; (c) that genes and spacers are present in about equal amounts in the mitochondrial genome and that they have comparable average sizes.

3.
Isochore structures in the mouse genome   Cited by: 2 (self-citations: 0, citations by others: 2)
Zhang CT, Zhang R. Genomics, 2004, 83(3): 384-394
The distribution of the G+C content in the mouse genome has been studied using a windowless technique. We have found that (i) abrupt variations of the G+C content from a GC-rich region to a GC-poor region, and vice versa, occur frequently at some sites along the sequence of the mouse genome, and (ii) long domains with relatively homogeneous G+C content (isochores) exist, which usually have sharp boundaries. Consequently, 28 isochores longer than 1 Mb have been identified in the mouse genome. A homogeneity index was used to quantify the variations of the G+C content within isochores. The precise boundaries, sizes, and G+C contents of these isochores have been determined. The windowless technique for the G+C content computation was also used to analyze the DNA sequence containing the mouse MHC region, which has a GC-poor isochore. This isochore is located at the central part of the sequence with boundaries at 468,459 and 812,716 bp, with the sequence running from the centromeric end to the telomeric end. In addition, the analysis of a segment of the rat genome shows that the rat genome also has clear isochore structures.

4.
The global, rather than local, variation in G+C content along the nuclear DNA sequences of various organisms was studied using GenBank sequence data. When long DNA sequences of the genomes of Escherichia coli and Saccharomyces cerevisiae were examined, the levels of their G+C content (G+C%) were found to be within a narrow range around that of the whole genome. The G+C% levels for sequences of vertebrate genomes, however, were found to cover a wide range, showing that their genome is a mosaic of sequences with different G+C% levels, in each of which the sequence is fairly homogeneous in its G+C% for a very long distance. Through surveying a human genetic map and GenBank DNA sequences, the global variations in G+C% along the human genome DNA were found to be correlated with chromosome band structures.

5.
Sueoka N, Kawanishi Y. Gene, 2000, 261(1): 53-62
The human genome, as in other eukaryotes, has a wide heterogeneity in the DNA base composition. The evolutionary basis for this heterogeneity has been unknown. A previous study of the human genome (846 genes analyzed) has shown that, in the major range of the G+C content in the third codon position (0.25-0.75), biases from the Parity Rule 2 (PR2) among the synonymous codons of the four-codon amino acids are similar except in the highest G+C range (Sueoka, N., 1999. Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position. Gene 238, 53-58.). PR2 is an intra-strand rule where A=T and G=C are expected when there are no biases between the two complementary strands of DNA in mutation and selection rates (substitution rates). In this study, 14,026 human genes were analyzed. In addition, the third codon positions of two-codon amino acids were analyzed. New results show the following: (a) The G+C contents of the third codon position of human genes are scattered in the G+C range of 0.22-0.96 in the third codon position. (b) The PR2 biases are similar in the range of 0.25-0.75, whereas, in the high G+C range (0.75-0.96; 13% of the genes), the PR2-bias fingerprints are different from those of the major range. (c) Unlike the PR2 biases, the G+C contents of the third codon position for both four-codon and two-codon amino acids are all correlated almost perfectly with the G+C content of the third codon position over the total G+C ranges. These results support the notion that the directional mutation pressure, rather than the directional selection pressure, is mainly responsible for the heterogeneity of the G+C content of the third codon position.

6.
Gao F, Zhang CT. The FEBS Journal, 2006, 273(8): 1637-1648
The availability of the complete chicken genome sequence provides an unprecedented opportunity to study the global genome organization at the sequence level. Delineating compositionally homogeneous G + C domains in DNA sequences can provide much insight into the understanding of the organization and biological functions of the chicken genome. A new segmentation algorithm, which is simple and fast, has been proposed to partition a given genome or DNA sequence into compositionally distinct domains. By applying the new segmentation algorithm to the draft chicken genome sequence, the mosaic organization of the chicken genome can be confirmed at the sequence level. It is shown herein that the chicken genome is also characterized by a mosaic structure of isochores, long DNA segments that are fairly homogeneous in the G + C content. Consequently, 25 isochores longer than 2 Mb (megabases) have been identified in the chicken genome. These isochores have a fairly homogeneous G + C content and often correspond to meaningful biological units. With the aid of the technique of cumulative GC profile, we proposed an intuitive picture to display the distribution of segmentation points. The relationships between G + C content and the distributions of genes (CpG islands, and other genomic elements) were analyzed in a perceivable manner. The cumulative GC profile, equipped with the new segmentation algorithm, would be an appropriate starting point for analyzing the isochore structures of higher eukaryotic genomes.

7.
We analysed complete or almost complete nucleotide sequences of the human, chimp, mouse, rat, chicken, dog, and other genomes and found that they contain extremely long (A+T) and (G+C) blocks that do not occur at all in the corresponding randomized sequences. The longest is an (A+T) block containing 1040 consecutive AT pairs that occurs in human chromosome 16. The longest human (G+C) block is 261 bp long. About half of the longest blocks occur in introns. The (A+T) blocks are discrete units whereas the (G+C) blocks are diffuse. They are embedded in the genome through connectors longer than 1 kilobase where the (G+C) content gradually decreases to the value of 50%. Remarkably, the (A+T) as well as (G+C) blocks are substantially shorter in the chimp genome. The chicken genome is characterized by very long (G+C) blocks that are even longer than those in the human genome. Though much shorter, long (G+C) and especially (A+T) blocks occur in lower organisms as well, which means that AT and GC pair clustering is an ancient property that has evolved to large scales in higher eukaryote genomes and the human genome in particular. Very long (A+T) and (G+C) blocks confer specific biophysical properties on DNA that are likely to influence genome folding in cell nuclei and its functional properties.
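A block search of the kind this abstract describes, finding the longest uninterrupted run of A/T or G/C bases and comparing it with a shuffled copy of the same sequence, might look like the sketch below. The names `longest_block` and `shuffled` are hypothetical, and a plain base-level shuffle is only an assumption about what "randomized sequences" means here; the abstract does not say how the randomization was done.

```python
import random
import re


def longest_block(seq: str, alphabet: str) -> tuple[int, int]:
    """Return (start, length) of the longest run built only from `alphabet`.

    Use alphabet="AT" for (A+T) blocks and "GC" for (G+C) blocks.
    Returns (-1, 0) if no base of the alphabet occurs.
    """
    best_start, best_len = -1, 0
    for match in re.finditer(f"[{alphabet}]+", seq.upper()):
        length = match.end() - match.start()
        if length > best_len:
            best_start, best_len = match.start(), length
    return best_start, best_len


def shuffled(seq: str, seed: int = 0) -> str:
    """Base-level shuffle that preserves composition (an assumed null model)."""
    bases = list(seq)
    random.Random(seed).shuffle(bases)
    return "".join(bases)


if __name__ == "__main__":
    seq = "GGATATTACGCG"
    print(longest_block(seq, "AT"))            # (2, 6)
    print(longest_block(seq, "GC"))            # (8, 4)
    print(longest_block(shuffled(seq), "AT"))  # depends on the shuffle
```

On real chromosomes the comparison would be repeated over many shuffles to see how rarely a run as long as the observed one arises by chance.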

8.
A survey of minisatellites (MSs) in 5.3 Mb of randomly selected rice DNA sequences from public databases was carried out to clarify the role of transposable elements (TEs) in the dispersal of MSs in the rice genome. The estimated frequency of MSs in this sample was one per 23.4 kb, and this frequency is approximately equivalent to that of Class I microsatellites in the rice genome. Of the MSs in the 5.3-Mb sequence sample, 82% were found to be present in multiple copies in the rice genome, and all of these were a part of TE sequences. In this study at least 61 TE groups were identified as MS carriers. It was also shown that the GC-rich MS pOs6.2H, which was previously reported to be one of the interspersed MSs in the rice genome, is a component of an En/Spm-like element. These results indicate that the majority of MSs in the rice genome are maintained in TEs, and amplified and dispersed as components of the TEs. The G+C content of the multi-locus MS sequences reflected that of the TE sequences containing those MSs, but no obvious bias towards the high G+C content of DNA was observed. Single locus MSs also did not show any obvious bias towards the high G+C content of DNA in the rice genome. In this respect, the MSs in the rice genome are quite different from those in the human genome: in the latter, the majority of MSs show an obvious bias towards the high G+C content of DNA. Electronic Supplementary Material: supplementary material is available in the online version of this article. Communicated by M.-A. Grandbastien.

9.
10.
The human genome is described in the literature as being composed of isochores, i.e., long (hundreds of kilobases) segments with a homogeneous (G + C) content. We calculated the (G + C) content variations along the DNA molecules of the human chromosomes 21 and 22 and found the variations to be higher everywhere compared to the randomized sequences. Hence the (G + C) content is certainly not homogeneous on the isochore scale in the two human chromosomes. In addition, we found no significant difference between the two human molecules and the genome of E. coli regarding the (G + C) content variations. Hence either no isochores are present in the DNA molecules of the human chromosomes 21 and 22, or isochores are also present in the genome of Escherichia coli. In any case, the present communication demonstrates that isochores should be defined in unambiguous molecular terms if they are to be used for an up-to-date genome structure characterization.

11.
Gene, 1996, 174(1): 43-50
The fungus Phycomyces blakesleeanus has a relatively small genome, 30 megabases (Mb), with a low guanine and cytosine (G+C) content, 35%; the coding sequences cloned to date all have a G+C content of about 50%. In order to investigate the organization of the genome of this fungus, we have cloned and sequenced 251 DNA fragments. One hundred and twenty-six clones were obtained by digestion with MspI (target sequence 5′-CCGG-3′) and 125 random clones were obtained by sonication. The average length of sequence obtained was about 200 base pairs (bp) and the total length was about 50 kilobases (kb). The G + C content is not homogeneous throughout the genome: sequences obtained after digestion with MspI have an average of 5% more G + C content than the random fragments, and are enriched in coding sequences. Fourteen MspI fragments show similarities to known proteins and 21 encode ribosomal RNA (rRNA). By contrast, only three of the random fragments are similar to known proteins and only one to an rRNA. We conclude that the Phycomyces genome is composed of G+C-rich genes surrounded by G+C-poor areas. Two clones have similarities to the transposase of the transposon Tc1 from Caenorhabditis elegans. This result suggests the presence of a high copy number of a Tc1-like transposable element in the Phycomyces genome. Another clone was similar to the transposon Tx1 from Xenopus laevis. A novel repetitive nt sequence has been characterized; about 5% of the total genome is a repetition of either of two consensus sequences of 31 bp named PrA1 and PrA2.

12.
Characterization of the genome of the basidiomycete Schizophyllum commune   Cited by: 8 (self-citations: 0, citations by others: 8)
DNA of Schizophyllum commune was isolated both from mycelial cells and from protoplasts. Nuclear DNA was isolated after solubilization of the mitochondria with the detergent Nonidet. The G + C content of the nuclear DNA was 57%, calculated from its buoyant density (1.7165 g/ml) and from the Tm (77.4 °C in 15 mM NaCl/1.5 mM trisodium citrate). The buoyant density of the ribosomal cistrons was 1.707 g/ml. DNA isolated from purified mitochondria had a very low G + C content: 22% (ρ = 1.6845 g/ml, Tm = 61.8 °C in 15 mM NaCl/1.5 mM trisodium citrate). Analysis of CsCl profiles and melting patterns suggested that mitochondrial DNA contains interspersed (A + T)-rich sequences. From reassociation analysis of sheared nuclear DNA the genome size of S. commune was determined to be 22.8 × 10^9 daltons. A small amount of DNA (0.5 × 10^9 daltons) bound to hydroxyapatite at zero time Cot. 7% of the genome (1.6 × 10^9 daltons) represented repetitive DNA.

13.
Stable isotope probing (SIP) of nucleic acids is a powerful tool that can identify the functional capabilities of noncultivated microorganisms as they occur in microbial communities. While it has been suggested previously that nucleic acid SIP can be performed with 15N, nearly all applications of this technique to date have used 13C. Successful application of SIP using 15N-DNA (15N-DNA-SIP) has been limited, because the maximum shift in buoyant density that can be achieved in CsCl gradients is approximately 0.016 g ml^-1 for 15N-labeled DNA, relative to 0.036 g ml^-1 for 13C-labeled DNA. In contrast, variation in genome G+C content between microorganisms can result in DNA samples that vary in buoyant density by as much as 0.05 g ml^-1. Thus, natural variation in genome G+C content in complex communities prevents the effective separation of 15N-labeled DNA from unlabeled DNA. We describe a method which disentangles the effects of isotope incorporation and genome G+C content on DNA buoyant density and makes it possible to isolate 15N-labeled DNA from heterogeneous mixtures of DNA. This method relies on recovery of "heavy" DNA from primary CsCl density gradients followed by purification of 15N-labeled DNA from unlabeled high-G+C-content DNA in secondary CsCl density gradients containing bis-benzimide. This technique, by providing a means to enhance separation of isotopically labeled DNA from unlabeled DNA, makes it possible to use 15N-labeled compounds effectively in DNA-SIP experiments and also will be effective for removing unlabeled DNA from isotopically labeled DNA in 13C-DNA-SIP applications.

14.
We analysed complete or almost complete nucleotide sequences of the human, chimp, mouse, rat, chicken, dog, and other genomes and found that they contain extremely long (A+T) and (G+C) blocks that do not occur at all in the corresponding randomized sequences. The longest is an (A+T) block containing 1040 consecutive AT pairs that occurs in human chromosome 16. The longest human (G+C) block is 261 bp long. About half of the longest blocks occur in introns. The (A+T) blocks are discrete units whereas the (G+C) blocks are diffuse. They are embedded in the genome through connectors longer than 1 kilobase where the (G+C) content gradually decreases to the value of 50%. Remarkably, the (A+T) as well as (G+C) blocks are substantially shorter in the chimp genome. The chicken genome is characterized by very long (G+C) blocks that are even longer than those in the human genome. Though much shorter, long (G+C) and especially (A+T) blocks occur in lower organisms as well, which means that AT and GC pair clustering is an ancient property that has evolved to large scales in higher eukaryote genomes and the human genome in particular. Very long (A+T) and (G+C) blocks confer specific biophysical properties on DNA that are likely to influence genome folding in cell nuclei and its functional properties.

15.
A novel method to calculate the G+C content of genomic DNA sequences.   Cited by: 2 (self-citations: 0, citations by others: 2)
The base composition of a DNA fragment or genome is usually measured by the proportion of A+T or G+C in the sequence. The G+C content along genomic sequences is usually calculated using an overlapping or non-overlapping sliding window method. The result and accuracy of such an approach depend on the size of the window and the moving distance adopted. In this paper, a novel windowless technique to calculate the G+C content of genomic sequences is proposed. By this method, the G+C content can be calculated at different "resolutions". In an extreme case, the G+C content may be computed at a specific point, rather than in a window of finite size. This is particularly useful for analyzing the fine variation of base composition along genomic sequences. As the first example, the variation of G+C content along each of the 16 yeast chromosomes is analyzed. The G+C-rich regions longer than 5 kb are detected and listed in detail. It is found that each chromosome consists of several alternating G+C-rich and G+C-poor regions, i.e., a mosaic structure. Another example is the analysis of the G+C content for each of the two chromosomes of the Vibrio cholerae genome. Based on the variations of the G+C content in each chromosome, it is shown that some fragments in the Vibrio cholerae genome may have been transferred from other species. In particular, the position and size of the large integron island on the smaller chromosome were precisely predicted. This method would be a useful tool for analyzing genomic sequences.
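For reference, the conventional sliding-window calculation that this abstract contrasts with its windowless technique can be sketched as below. This is not the windowless method itself, whose details the abstract does not give; `window_gc` is a hypothetical name, and setting `step` equal to `window` gives the non-overlapping variant.

```python
def window_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """G+C fraction in sliding windows.

    Returns a list of (window start, G+C fraction) pairs. Windows overlap
    when step < window and are non-overlapping when step == window.
    A trailing stretch shorter than `window` is ignored.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append((start, sum(b in "GC" for b in chunk) / window))
    return profile


if __name__ == "__main__":
    seq = "GGCCAATT"
    print(window_gc(seq, window=4, step=4))  # [(0, 1.0), (4, 0.0)]
    print(window_gc(seq, window=4, step=2))  # [(0, 1.0), (2, 0.5), (4, 0.0)]
```

The two calls give different profiles for the same sequence, which is the dependence on window size and moving distance that the abstract points out.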

16.
Herpesvirus sylvilagus is a lymphotropic (type gamma) herpesvirus of cottontail rabbits (Sylvilagus floridanus). Analysis of virion DNA of herpesvirus sylvilagus has revealed that the genome consists of one stretch of about 120 kilobase pairs of internal, unique DNA flanked by a variable number of 553-base-pair tandem repeats. The G + C content of the repetitive DNA is extremely high (83%), as determined by sequencing. The organization of the herpesvirus sylvilagus genome is, therefore, similar to that of the primate lymphotropic viruses herpesvirus saimiri and herpesvirus ateles.

17.
A temperate phage was isolated from emetic Bacillus cereus NCTC 11143 by mitomycin C and characterized by transmission electron microscopy and DNA and protein analyses. Whole genome sequencing of Bacillus phage 11143 was performed by GS-FLX. The phage has a dsDNA genome of 39,077 bp and a 35% G+C content. Bioinformatic analysis of the phage genome revealed 49 putative ORFs involved in replication, morphogenesis, DNA packaging, lysogeny, and host lysis. Bacillus phage 11143 could be classified as a member of the Siphoviridae family by morphology and genome structure. Genomic comparisons at the DNA and protein levels revealed homologous genetic modules with patterns and morphogenesis proteins similar to those of other Bacillus phages. Thus, Bacillus phages might have a mosaic genetic relationship.

18.
R. Garesse. Genetics, 1988, 118(4): 649-663
The sequence of an 8351-nucleotide mitochondrial DNA (mtDNA) fragment has been obtained, extending the knowledge of the Drosophila melanogaster mitochondrial genome to 90% of its coding region. The sequence encodes seven polypeptides, 12 tRNAs and the 3' end of the 16S rRNA and CO III genes. The gene organization is strictly conserved with respect to the Drosophila yakuba mitochondrial genome, and different from that found in mammals and Xenopus. The high A + T content of D. melanogaster mitochondrial DNA is reflected in a reiterative codon usage, with more than 90% of the codons ending in T or A, G + C rich codons being practically absent. The average level of homology between the D. melanogaster and D. yakuba sequences is very high (roughly 94%), although insertions and deletions have been detected in protein, tRNA and large ribosomal genes. The analysis of nucleotide changes reveals a similar frequency for transitions and transversions, and reflects a strong bias against G + C on both strands. The predominant type of transition is strand specific.

19.
20.
I show that the recognition sequences of Type II restriction systems are correlated with the G + C content of the host bacterial DNA. Almost all restriction systems with G + C rich tetranucleotide recognition sequences are found in species with A + T rich genomes, whereas G + C rich hexanucleotide and octanucleotide recognition sequences are found almost exclusively in species with G + C rich genomes. Most hexanucleotide recognition sequences found in species with A + T rich genomes are A + T rich. This distribution eliminates a substantial proportion of the potential variance in the frequency of restriction recognition sequences in the host genomes. As a consequence, almost all restriction recognition sequences, including those eight base pairs in length (Not I and Sfi I), are predicted to occur with a frequency ranging from once every 300 to once every 5,000 base pairs in the host genome. Since the G + C content of bacteriophage DNA and of the host genome are also correlated, the data presented are evidence that most Type II "restriction systems" are indeed involved in phage restriction.
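The predicted site frequencies in this abstract can be illustrated with a simple composition-only model. The sketch below assumes that bases occur independently, that G and C are equally frequent, and that A and T are equally frequent; the abstract does not state which model was actually used, and the function names are hypothetical. An `N` in a site matches any base.

```python
def site_probability(site: str, gc: float) -> float:
    """Probability that a random position starts an occurrence of `site`.

    Assumed model: independent bases with P(G) = P(C) = gc / 2 and
    P(A) = P(T) = (1 - gc) / 2. 'N' matches any base.
    """
    if not 0.0 < gc < 1.0:
        raise ValueError("gc must lie strictly between 0 and 1")
    p = 1.0
    for base in site.upper():
        if base in "GC":
            p *= gc / 2
        elif base in "AT":
            p *= (1 - gc) / 2
        elif base != "N":
            raise ValueError(f"unsupported symbol: {base}")
    return p


def expected_spacing(site: str, gc: float) -> float:
    """Expected number of base pairs between occurrences of `site`."""
    return 1.0 / site_probability(site, gc)


if __name__ == "__main__":
    not_i = "GCGGCCGC"                       # Not I recognition sequence
    print(expected_spacing(not_i, 0.5))      # 65536.0
    print(expected_spacing(not_i, 0.7))      # shorter spacing in a G+C-rich genome
```

Under this model an all-G/C octanucleotide site is expected once every 65,536 bp at 50% G + C, but roughly once every 4,400 bp at 70% G + C, which shows how a match between site composition and host composition narrows the range of site frequencies.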
