Found 20 similar articles (search time: 15 ms)
1.
Past analyses of the genome of the yeast Saccharomyces cerevisiae have revealed substantial regional variation in G+C content. Important questions remain, though, as to the origin, nature, significance, and generality of this variation. We conducted an extensive analysis of the yeast genome to try to answer these questions. Our results indicate that open reading frames (ORFs) with similar G+C contents at silent codon positions are significantly clustered on chromosomes. This clustering can be explained by very short range correlations of silent-site G+C contents at neighboring ORFs. ORFs of high silent-site G+C content are disproportionately concentrated on shorter chromosomes, which causes a negative relationship between chromosome length and G+C content. Contrary to previous reports, there is no correlation between gene density and silent-site G+C content in yeast. Chromosome III is atypical in many regards, and possible reasons for this are discussed.
2.
The vertebrate genome is a mosaic of regions differing dramatically in their G + C content. Those regions with a high G + C content contain the expected number of CpG dinucleotides and we propose that following methylation these have been protected from deamination by the increased stability of the surrounding DNA duplex. This argument applies both to the microenvironment of the CpG dinucleotide and to whole gene regions.
3.
Three highly mutable loci of the wall-less pathogens Mycoplasma bovis, Mycoplasma pulmonis and Mycoplasma agalactiae undergo high-frequency genomic rearrangements and generate extensive antigenic variation of major surface lipoproteins. Adjacent to each locus, an open reading frame exists as a single chromosomal copy and is predicted to encode a site-specific DNA recombinase exhibiting high homology to the recombinases XerD of Escherichia coli and CodV of Bacillus subtilis. Each of the mycoplasmal proteins is a member of the lambda integrase family of tyrosine site-specific recombinases and likely mediates the site-specific DNA inversions observed within the adjacent, variable loci.
4.
The heterogeneity of gene nucleotide content in prokaryotic genomes is commonly interpreted as the result of three main phenomena: (1) genes undergo different selection pressures both during and after translation (affecting codon and amino acid choice); (2) genes undergo different mutational pressure whether they are on the leading or lagging strand; and (3) genes may have different phylogenetic origins as a result of lateral transfers. However, this view neglects the necessity of organizing genetic information on a chromosome that needs to be replicated and folded, which may add constraints to single gene evolution. As a consequence, genes are potentially subjected to different mutation and selection pressures, depending on their position in the genome. In this paper, we analyze the structuring of different codon usage measures along completely sequenced bacterial genomes. We show that most of them are highly structured, suggesting that genes have different base content, depending on their location on the chromosome. A peculiar pattern of genome structure, with a tendency toward an A+T-enrichment near the replication terminus, is found in most bacterial phyla and may reflect common chromosome constraints. Several species may have lost this pattern, probably because of genome rearrangements or integration of foreign DNA. We show that in several species, this enrichment is associated with an increase of evolutionary rate and we discuss the evolutionary implications of these results. We argue that structural constraints acting on the circular chromosome are not negligible and that this natural structuring of bacterial genomes may be a cause of overestimation in lateral gene transfer predictions using codon composition indices.
5.
Diversity in G + C content at the third position of codons in vertebrate genes and its cause. Total citations: 21 (self-citations: 11; citations by others: 21)
A positive correlation was found between the G + C content at the third codon position in vertebrate genes and the G + C content of the genome portion surrounding each gene. Exons of genes with a high G + C% at the third codon position are surrounded by G + C-rich introns and G + C-rich flanking sequences, and those with a low G + C% at that position by A + T-rich introns and flanking sequences. Analysis of G + C content distribution along DNA sequences using a DNA Sequence Data Bank supported the view that the vertebrate genome is a mosaic of regions with clear differences in their G + C content. The biological significance of the variation in G + C content throughout the vertebrate genome is discussed in connection with chromosomal banding.
6.
DNA of some anaerobic rumen fungi: G + C content determination. Total citations: 2 (self-citations: 0; citations by others: 2)
G. Billon-Grand J.B. Fiol A. Breton A. Bruyère Z. Oulhaj 《FEMS microbiology letters》1991,82(3):267-270
The nuclear DNAs from five species of anaerobic rumen fungi have been isolated and purified by means of two extraction methods (with and without 8 M urea). Their G + C contents have been characterized by the thermal denaturation procedure of Marmur and Doty. As has already been shown in Neocallimastix frontalis, the results obtained by the two techniques demonstrated a very low G + C content (less than 20%) and the constant presence of satellite DNA.
7.
The base composition of a DNA fragment or genome is usually measured by the proportion of A+T or G+C in the sequence. The G+C content along genomic sequences is usually calculated using an overlapping or non-overlapping sliding window method. The result and accuracy of such an approach depend on the size of the window and the moving distance adopted. In this paper, a novel windowless technique to calculate the G+C content of genomic sequences is proposed. By this method, the G+C content can be calculated at different "resolutions". In an extreme case, the G+C content may be computed at a specific point, rather than in a window of finite size. This is particularly useful for analyzing the fine variation of base composition along genomic sequences. As the first example, the variation of G+C content along each of the 16 yeast chromosomes is analyzed. The G+C-rich regions longer than 5 kb are detected and listed in detail. It is found that each chromosome consists of several alternating G+C-rich and G+C-poor regions, i.e., a mosaic structure. Another example is the analysis of the G+C content for each of the two chromosomes of the Vibrio cholerae genome. Based on the variations of the G+C content in each chromosome, it is shown that some fragments in the Vibrio cholerae genome may have been transferred from other species. In particular, the position and size of the large integron island on the smaller chromosome were precisely predicted. This method would be a useful tool for analyzing genomic sequences.
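As an illustration of the conventional sliding-window calculation that the abstract above contrasts with its windowless technique, here is a minimal Python sketch. It is not the windowless method from the paper; the function names `gc_fraction` and `sliding_gc` and the choice to ignore ambiguous bases are assumptions made for this example.

```python
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Assumption: a window with no unambiguous bases is reported as 0.0.
    return gc / total if total else 0.0


def sliding_gc(seq: str, window: int, step: int) -> List[Tuple[int, float]]:
    """Return (start, G+C fraction) for every full window along seq.

    step == window gives non-overlapping windows; step < window gives
    overlapping windows.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence; real analyses would use windows of thousands of bases.
    for start, gc in sliding_gc("GGCCAATT", window=4, step=2):
        print(start, gc)
```

The output depends directly on `window` and `step`, which is the sensitivity to window size and moving distance that the abstract points out.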
8.
Buckley DH Huangyutitham V Hsu SF Nelson TA 《Applied and environmental microbiology》2007,73(10):3189-3195
Stable isotope probing (SIP) of nucleic acids is a powerful tool that can identify the functional capabilities of noncultivated microorganisms as they occur in microbial communities. While it has been suggested previously that nucleic acid SIP can be performed with 15N, nearly all applications of this technique to date have used 13C. Successful application of SIP using 15N-DNA (15N-DNA-SIP) has been limited, because the maximum shift in buoyant density that can be achieved in CsCl gradients is approximately 0.016 g ml⁻¹ for 15N-labeled DNA, relative to 0.036 g ml⁻¹ for 13C-labeled DNA. In contrast, variation in genome G+C content between microorganisms can result in DNA samples that vary in buoyant density by as much as 0.05 g ml⁻¹. Thus, natural variation in genome G+C content in complex communities prevents the effective separation of 15N-labeled DNA from unlabeled DNA. We describe a method which disentangles the effects of isotope incorporation and genome G+C content on DNA buoyant density and makes it possible to isolate 15N-labeled DNA from heterogeneous mixtures of DNA. This method relies on recovery of "heavy" DNA from primary CsCl density gradients followed by purification of 15N-labeled DNA from unlabeled high-G+C-content DNA in secondary CsCl density gradients containing bis-benzimide. This technique, by providing a means to enhance separation of isotopically labeled DNA from unlabeled DNA, makes it possible to use 15N-labeled compounds effectively in DNA-SIP experiments and also will be effective for removing unlabeled DNA from isotopically labeled DNA in 13C-DNA-SIP applications.
9.
Summary: We develop a mathematical model for estimating evolutionary distance from restriction enzyme maps, which incorporates non-uniformity of the rate of base substitution into the theory and allows for an arbitrary G+C content at equilibrium. When the G+C content differs significantly from 1/2, the traditional model of base changes can introduce a systematic bias which depends upon the base composition of the restriction site. In addition, the accuracy of estimated evolutionary distance depends heavily upon the choice of restriction enzyme in that the expected number of sites is also affected. Monte Carlo experiments are conducted to check the validity of the present theoretical treatment, and from them we draw several cautionary notes on estimation. An application is made to the available data on restriction enzyme maps of human mitochondrial DNA where the G+C content is approximately 1/3. Contribution No. 1372 from the National Institute of Genetics, Mishima, 411 Japan.
10.
The Gesneriaceae (Lamiales) is a family of flowering plants comprising >3000 species of mainly tropical origin, the most familiar of which is the cultivated African violet (Saintpaulia spp.). Species of Gesneriaceae are poorly represented in the lists of taxa sampled for genome size estimation; measurements are available for three species of Ramonda and one each of Haberlea, Saintpaulia, and Streptocarpus, all species of Old World origin. We report here nuclear genome size estimates for 10 species of Sinningia, a neotropical genus largely restricted to Brazil. Flow cytometry of leaf cell nuclei showed that holoploid genome size in Sinningia is very small (approximately two times the size of the Arabidopsis genome), and is small compared to the other six species of Gesneriaceae with genome size estimates. We also documented intraspecific genome size variation of 21%-26% within a group of wild Sinningia speciosa (Lodd.) Hiern collections. In addition, we analyzed 1210 genome survey sequences from S. speciosa to characterize basic features of the nuclear genome such as guanine-cytosine content, types of repetitive elements, numbers of protein-coding sequences, and sequences unique to S. speciosa. We included several other angiosperm species as genome size standards, one of which was the snapdragon (Antirrhinum majus L.; Veronicaceae, Lamiales). Multiple measurements on three accessions indicated that the genome size of A. majus is ~633 × 10⁶ base pairs, which is approximately 40% of the previously published estimate.
11.
Significant differences between the G+C content of synonymous codons in orthologous genes and the genomic G+C content. Total citations: 2 (self-citations: 0; citations by others: 2)
The relationship between the overall G+C content of the genome (GC) and the GC content at the third codon positions (GC3) of genes, which we refer to as a GC3-plot, was examined using 15 currently available complete genome sequences. A remarkably linear relationship was found between these two quantities, confirming previous observations of a strong positive correlation in the GC3-plot. In order to conduct a more detailed analysis of the GC3-plot, we examined the GC3 content by separating orthologous codons into three categories: synonymously different codons (namely identical amino acids, IA), different amino acids (DA), and identical codons (IC), for a pairwise comparison of two closely related species. When we took pairwise species comparisons between Mycoplasma genitalium (Mg) and Mycoplasma pneumoniae (Mp) and between Mycobacterium tuberculosis (Mt) and Mycobacterium leprae (Ml) as examples, we found that for Mp and Ml, the GC3 for IA deviated the most from the linear expectation in the GC3-plot, whereas for Mg and Mt the deviation was minimal. These findings suggest that the major changes of GC content took place in Mp and Ml, but not in Mg and Mt. This analysis also enables us to predict the future direction of the evolutionary changes of the genomic GC content.
12.
Cytofluorometric DNA base determination in vertebrate species with different genome sizes. Total citations: 1 (self-citations: 0; citations by others: 1)
T Capriglione E Olmo G Odierna B Improta A Morescalchi 《Basic and applied histochemistry》1987,31(2):119-126
The base composition of DNA was studied in 15 amphibian species and 28 reptile species by means of DAPI, a fluorochrome specific for adenine-thymine rich DNA (AT-rich DNA). The results obtained in reptiles and anuran amphibians coincided with biochemical data available for some species. In urodeles, on the contrary, the findings contrasted with biochemical data and suggest that DAPI is unable to stain all the AT-rich DNA in the erythrocytes of these organisms. It is concluded that the method is suitable for studying species with a small genome size, such as reptiles and anuran amphibians, but is not suitable for nuclei with a large genome size and a highly compact chromatin, such as urodele erythrocytes.
13.
J Vieira B Charlesworth 《Proceedings. Biological sciences / The Royal Society》1999,266(1431):1905-1912
Comparisons of polymorphism patterns between distantly related species are essential in order to determine their generality. However, most work on the genus Drosophila has been done only with species of the subgenus Sophophora. In the present work, we have sequenced one intron and surrounding coding sequences of 6 X-linked genes (chorion protein s36, elav, fused, runt, suppressor of sable and zeste) from 21 strains of wild-type Drosophila virilis (subgenus Drosophila). From these data, we have estimated the average level of DNA polymorphism, inferred the effective population size and population structure of this species, and compared the results with those obtained for other Drosophila species. There is no reduction in variation at two loci close to the centromeric heterochromatin, in contrast to Drosophila melanogaster.
14.
We conducted a genome-wide analysis of variations in guanine plus cytosine (G+C) content at the third codon position at silent substitution sites of orthologous human and mouse protein-coding nucleotide sequences. Alignments of 3776 human protein-coding DNA sequences with mouse orthologs having >50 synonymous codons were analyzed, and nucleotide substitutions were counted by comparing sequences in the alignments extracted from gap-free regions. The G+C content at silent sites in these pairs of genes showed a strong negative correlation (r = -0.93). Some gene pairs showed significant differences in G+C content at the third codon position at silent substitution sites. For example, human thymine-DNA glycosylase was A+T-rich at the silent substitution sites, while the orthologous mouse sequence was G+C-rich at the corresponding sites. In contrast, human matrix metalloproteinase 23B was G+C-rich at silent substitution sites, while the mouse ortholog was A+T-rich. We discuss possible implications of this significant negative correlation of G+C content at silent sites.
15.
Almost one thousand 16S rRNA sequences of Gram-positive bacteria with a low DNA G + C content from public databases were analyzed using the ARB software package. A signature region was identified between positions 354 and 371 (E. coli numbering) for the Bacillus sub-branch of the Gram-positive bacteria with a low DNA G + C content, the former orders Bacillales and Lactobacillales. Three oligonucleotide probes, namely LGC354A, LGC354B, and LGC354C, were designed to target this diagnostic site. Their fluorescent derivatives were suitable for whole cell detection by fluorescence in situ hybridization (FISH). Hybridization conditions were adjusted for differentiation of target and related non-target reference species. When applying FISH to whole bacterial cells in a sample of activated sludge from a communal wastewater treatment plant, members of the Bacillus sub-branch were detected at levels from 0.01% of cells in samples fixed with paraformaldehyde to over 8% in the same samples fixed with ethanol and treated with lysozyme. The problems of quantitative in situ analysis of Gram-positive bacteria with a low DNA G + C content in biofilm flocs are discussed and recommendations made. Members of the Bacillus sub-branch were detected in different abundances in activated sludge samples from different wastewater plants.
16.
Kruisselbrink E Guryev V Brouwer K Pontier DB Cuppen E Tijsterman M 《Current biology : CB》2008,18(12):900-905
To safeguard genetic integrity, cells have evolved an accurate but not failsafe mechanism of DNA replication. Not all DNA sequences tolerate DNA replication equally well [1]. Also, genomic regions that impose structural barriers to the DNA replication fork are a potential source of genetic instability [2, 3]. Here, we demonstrate that G4 DNA, a sequence motif that folds into quadruplex structures in vitro [4, 5], is highly mutagenic in vivo and is removed from genomes that lack dog-1, the C. elegans ortholog of mammalian FANCJ [6, 7], which is mutated in Fanconi anemia patients [8-11]. We show that sequences that match the G4 DNA signature G₃₋₅N₁₋₃G₃₋₅N₁₋₃G₃₋₅N₁₋₃G₃₋₅ are deleted in germ and somatic tissues of dog-1 animals. Unbiased aCGH analyses of dog-1 genomes that were allowed to accumulate mutations in >100 replication cycles indicate that deletions are found exclusively at G4 DNA; deletion frequencies can reach 4% per site per animal generation. We found that deletion sizes fall short of Okazaki fragment lengths [12], and no significant microhomology was observed at deletion junctions. The existence of 376,000 potentially mutagenic G4 DNA sites in the human genome could have major implications for the etiology of hereditary FancJ and nonhereditary cancers.
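The G4 DNA signature quoted in the abstract above (four runs of 3 to 5 guanines separated by spacers of 1 to 3 arbitrary nucleotides) maps naturally onto a regular expression. The Python sketch below is one possible reading of that signature, not the authors' search procedure; the name `find_g4_sites` is hypothetical, only the given strand is scanned, and matches are reported without overlaps.

```python
import re
from typing import List, Tuple

# Three repeats of "G-run plus spacer", followed by a final G-run.
# Assumption: N means any of A, C, G, T (ambiguity codes are not matched).
G4_PATTERN = re.compile(r"(?:G{3,5}[ACGT]{1,3}){3}G{3,5}")


def find_g4_sites(seq: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, matched_text) for non-overlapping signature matches.

    Only the strand given in seq is scanned; to cover the complementary
    strand, call the function again on the reverse complement.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    example = "TTGGGAGGGTGGGCGGGTT"  # made-up sequence for illustration
    for start, end, text in find_g4_sites(example):
        print(start, end, text)
```

Because `finditer` returns non-overlapping matches, a genome-wide count produced this way may differ from counts obtained with other conventions for overlapping or adjacent motifs.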
17.
Hideo Yamagishi 《Journal of molecular evolution》1974,3(3):239-242
Summary: The DNAs of Micrococcus lysodeikticus and Clostridium perfringens were fragmented to about 7,000 nucleotide pairs long by shear and fractionated with respect to buoyant density of mercury complexes in Cs2SO4. The distribution of G + C content in both DNAs was characteristically asymmetric. In M. lysodeikticus DNA, low G + C fragments were more numerous than high G + C fragments, whereas in C. perfringens DNA, high G + C fragments were more numerous than low G + C fragments. The G + C content of fragments of M. lysodeikticus DNA varied from 70 to 77%, with a mean and standard deviation of 73.7 ± 1.92% G + C, and that of C. perfringens DNA varied from 27 to 34%, with a mean and standard deviation of 29.8 ± 1.34% G + C. The standard deviation was smaller than that of Escherichia coli DNA fragments of similar size. Biological meanings of the relatively low heterogeneity in nucleotide composition in M. lysodeikticus and C. perfringens are discussed.
18.
19.
The concept of homogeneity of G+C content is always relative and subjective. This point is emphasized and quantified in this paper using a simple example of one sequence segmented into two subsequences. Whether the sequence is homogeneous or not can be answered by whether the two-subsequence model describes the DNA sequence better than the one-sequence model. There are at least three equivalent ways of looking at the 1-to-2 segmentation: Jensen-Shannon divergence measure, log likelihood ratio test, and model selection using Bayesian information criterion. Once a criterion is chosen, a DNA sequence can be recursively segmented into multiple domains. We use one subjective criterion called segmentation strength based on the Bayesian information criterion. Whether or not a sequence is homogeneous and how many domains it has depend on this criterion. We compare six different genome sequences (yeast S. cerevisiae chromosomes III and IV, bacterium M. pneumoniae, human major histocompatibility complex sequence, longest contigs in human chromosomes 21 and 22) by recursive segmentations at different strength criteria. Results by recursive segmentation confirm that yeast chromosome IV is more homogeneous than yeast chromosome III, human chromosome 21 is more homogeneous than human chromosome 22, and bacterial genomes may not be homogeneous due to short segments with distinct base compositions. The recursive segmentation also provides a quantitative criterion for identifying isochores in human sequences. Some features of our recursive segmentation, such as the possibility of delineating domain borders accurately, are superior to those of the moving-window approach commonly used in such analyses.
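The 1-to-2 segmentation described in the abstract above can be sketched with the Jensen-Shannon divergence: for each candidate cut point, compare the entropy of the whole sequence with the length-weighted entropies of the two parts, and keep the cut with the largest divergence. The Python code below is a minimal sketch under stated assumptions, not the paper's implementation: the names `entropy` and `best_split` are hypothetical, all four bases are used as symbols, and the BIC-based segmentation strength and the recursive step are left out.

```python
import math
from collections import Counter
from typing import Tuple

BASES = "ACGT"


def entropy(counts: Counter, n: int) -> float:
    """Shannon entropy (in bits) of a base-count table holding n bases."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values() if c > 0)


def best_split(seq: str) -> Tuple[int, float]:
    """Return (position, divergence) for the cut seq[:position] | seq[position:]
    that maximizes the Jensen-Shannon divergence between the two parts."""
    bases = [b for b in seq.upper() if b in BASES]  # drop ambiguous symbols
    n = len(bases)
    if n < 2:
        raise ValueError("need at least two unambiguous bases")
    total = Counter(bases)
    h_all = entropy(total, n)
    left: Counter = Counter()
    best_pos, best_div = 0, -1.0
    for pos in range(1, n):
        left[bases[pos - 1]] += 1
        right = total - left  # Counter subtraction keeps only positive counts
        div = (
            h_all
            - (pos / n) * entropy(left, pos)
            - ((n - pos) / n) * entropy(right, n - pos)
        )
        if div > best_div:
            best_pos, best_div = pos, div
    return best_pos, best_div


if __name__ == "__main__":
    print(best_split("AAAAAAGGGGGG"))
```

Whether the best cut is accepted as a real domain border still depends on the chosen criterion, which is the subjectivity the abstract emphasizes; a recursive version would apply `best_split` again to each accepted part.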
20.
Codon usage in the G+C-rich Streptomyces genome. Total citations: 45 (self-citations: 0; citations by others: 45)
The codon usage (CU) patterns of 64 genes from the Gram+ prokaryotic genus Streptomyces were analysed. Despite the extremely high overall G+C content of the Streptomyces genome (estimated at 0.74), individual genes varied in G+C content from 0.610 to 0.797, and had third codon position G+C contents (GC3s) that varied from 0.764 to 0.983. The variation in GC3s explains a significant proportion of the variation in CU patterns. This is consistent with an evolutionary model of the Streptomyces genome where biased mutation pressure has led to a high average G+C content with random variation about the mean, although the variation observed is greater than that expected from a simple binomial model. The only gene in the sample that can be confidently predicted to be highly expressed, EF-Tu of Streptomyces coelicolor A3(2) (GC3s = 0.927), shows a preference for a third position C in several of the four codon families, and for CGY and GGY for Arg and Gly codons, respectively (Y = pyrimidine); similar CU patterns are found in highly expressed genes of the G+C-rich Micrococcus luteus genome. It thus appears that codon usage in Streptomyces is determined predominantly by mutation bias, with weak translational selection operating only in highly expressed genes. We discuss the possible consequences of the extreme codon bias of Streptomyces and consider how it may have evolved. A set of CU tables is provided for use with computer programs that locate protein-coding regions.
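For readers who want to reproduce a quantity like the GC3s values quoted in the abstract above, the following Python sketch computes the G+C fraction at third codon positions of synonymous codons. It assumes the common definition that excludes Met (ATG), Trp (TGG), and the three stop codons of the standard genetic code; the function name `gc3s` is hypothetical, and the input is assumed to be an in-frame coding sequence.

```python
# Codons with no synonymous alternative, plus stop codons (standard code).
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds: str) -> float:
    """G+C fraction at the third position of synonymous codons in an
    in-frame coding sequence.

    Codons containing symbols other than A, C, G, T are skipped, and a
    trailing partial codon is ignored.
    """
    cds = cds.upper()
    gc = 0
    counted = 0
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon in EXCLUDED or any(b not in "ACGT" for b in codon):
            continue
        counted += 1
        if codon[2] in "GC":
            gc += 1
    if counted == 0:
        raise ValueError("no synonymous codons found")
    return gc / counted


if __name__ == "__main__":
    # Made-up reading frame: ATG GCC AAA TGG GGC TAA
    print(gc3s("ATGGCCAAATGGGGCTAA"))
```

In the example frame, only GCC, AAA, and GGC are counted, and two of the three end in G or C.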