Similar articles
20 similar articles found.
1.
In this paper, a new algorithm is presented that enables multilevel comparison of BLOSUM protein substitution matrices based on data from different groups of organisms. As an example, a comparison between substitution matrices based on data from two groups of bacterial genomes with different GC content is presented. Our approach includes evaluating the number of amino acid pairs in BLOCKS databases created separately for the two groups of bacteria using protein sequences deposited in the COG database. Differences in the distributions of amino acid pair counts are tested using the chi-squared based G-test. Different analysis levels make it possible to distinguish different patterns of amino acid substitution. Application of the algorithm reveals statistically significant differences in amino acid substitution patterns between AT-rich and GC-rich groups of bacterial organisms. The differences are particularly visible in the overall substitution pattern, the amino acid conservation pattern, and the comparison of substitution patterns for single amino acids. The algorithm presented in this paper can be considered a novel method for multilevel comparison of amino acid substitution patterns. The presented approach is not limited to bacterial organisms and BLOSUM substitution matrices. Statistically significant differences between substitution patterns in the two groups of bacterial organisms with respect to the amino acid conservation pattern may be evidence of different rates of evolutionary change between AT-rich and GC-rich bacterial organisms.
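
The G-test mentioned in this abstract can be sketched as follows. This is only an illustration of the general statistic for a 2 x k table of pair counts, not the authors' implementation; the function name `g_test` and the toy counts are assumptions made for the example.

```python
import math

def g_test(counts_a, counts_b):
    """Return (G, degrees of freedom) for a 2 x k contingency table.

    counts_a and counts_b are hypothetical amino acid pair counts from
    two groups (for example AT-rich and GC-rich genomes), listed in the
    same pair order.
    """
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand = total_a + total_b
    g = 0.0
    for a, b in zip(counts_a, counts_b):
        column = a + b
        for observed, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * column / grand
            if observed > 0:  # a zero cell contributes nothing to G
                g += 2.0 * observed * math.log(observed / expected)
    return g, len(counts_a) - 1

# Toy counts: the two groups distribute their pairs differently.
g, dof = g_test([30, 10], [10, 30])
print(f"G = {g:.2f}, dof = {dof}")
```

G is compared against a chi-squared distribution with the returned degrees of freedom to obtain a p-value.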

2.
The activity of DNA polymerase-associated proofreading 3'-exonucleases is generally enhanced in less stable DNA regions, leading to a reduction in base substitution error frequencies in AT- versus GC-rich sequences. Unexpectedly, however, the opposite result was found for Escherichia coli DNA polymerase II (pol II). Nucleotide misincorporation frequencies for pol II were found to be 3-5-fold higher in AT- compared with GC-rich DNA, both in the presence and absence of polymerase processivity subunits, beta dimer and gamma complex. In contrast, E. coli pol III holoenzyme, behaving "as expected," exhibited 3-5-fold lower misincorporation frequencies in AT-rich DNA. A reduction in fidelity in AT-rich regions occurred for pol II despite its having an associated 3'-exonuclease proofreading activity that preferentially degrades AT-rich compared with GC-rich DNA primer-template in the absence of DNA synthesis. Concomitant with a reduction in fidelity, pol II polymerization efficiencies were 2-6-fold higher in AT-rich DNA, depending on sequence context. The paradoxical fidelity behavior of pol II can be accounted for by the enzyme's preference for forward polymerization in AT-rich sequences. The more efficient polymerization suppresses proofreading, thereby causing a significant increase in base substitution error rates in AT-rich regions.

3.
The published nucleotide sequences of the E. coli and S. typhimurium trp A and trp B genes show a high degree of similarity between homologous genes of the two organisms, and an even greater degree of similarity between the amino acid sequences of the gene products. In spite of this, analysis of the nucleotide sequences reveals marked differences between E. coli and S. typhimurium genes with respect to potential frameshift mutation hot-spots and the mutationally important dam and mec methylation sites. Such existing differences may well lead to divergent evolution of these two presently closely related bacteria. Codon usage patterns in the trp A and trp B genes of E. coli and S. typhimurium, and the lac I gene of E. coli, have been re-analysed in terms of AT-rich, GC-rich, neutral, or unique codons, and marked preferences were found. In some cases particular amino acids are most often specified by the AT-rich alternative codons, in others by the GC-rich ones. In still other cases the codon preference depends on the gene studied. These patterns can be interpreted in terms of enteric bacterial evolution, via hybridizations, from ancestral bacteria with AT- or GC-rich DNA.

4.
M D Wyatt, M Lee, J A Hartley. Nucleic Acids Research, 1997, 25(12): 2359-2364
The covalent sequence specificity of a series of nitrogen mustard and imidazole-containing analogues of distamycin was determined using modified sequencing techniques. The analogues tether benzoic acid mustard (BAM) and possess either one, two or three imidazole units. Examination of the alkylation specificity revealed that BAM produced guanine-N7 lesions in a pattern similar to conventional nitrogen mustards. The monoimidazole-BAM conjugate also produced guanine-N7 alkylation in a similar pattern to BAM, but at a 100-fold lower dose. The diimidazole and triimidazole conjugates did not produce detectable guanine-N7 alkylation but only alkylated at selected sites in the minor groove. Unexpectedly, the alkylation specificity at equivalent doses was nearly identical to that found for the previously reported pyrrole-BAM conjugates. The consensus sequence, 5'-TTTTGPu, was strongly alkylated by the triimidazole conjugate in preference to other similar sites, including three occurrences of 5'-TTTTAA. Footprinting studies were carried out to examine the non-covalent DNA binding interactions. These studies revealed that the tripyrrole-BAM conjugate bound non-covalently to the same AT-rich sites as distamycin. In contrast, whereas the Im3lexitropsin bound non-covalently to GC-rich sequences, the triimidazole-BAM conjugate did not detectably footprint to either GC- or AT-rich regions at equivalent doses. The results indicate that the alkylation event is not solely dictated by the non-covalent binding and might be influenced by a unique sequence-dependent conformational feature of the consensus sequence 5'-TTTTGPu.

5.
Mitochondrial DNAs of six morphologically different Phytophthora species were digested with 15 restriction enzymes. The numbers of restriction fragments obtained differed considerably from those theoretically expected for random base distribution. Enzymes with relatively many G and C in their recognition sequences produced significantly larger numbers of fragments. Moreover, fragments generated by most of these enzymes were more often shared by two or more species than those from enzymes with more A and T in their recognition sequence. It is concluded that base distribution in mitochondrial DNA of Phytophthora is heterogeneous, with AT-rich stretches scattered over the mitochondrial genome and GC-rich regions present in conserved sequences, presumably genes. A practical consequence for taxonomic RFLP studies is that optimal enzymes can be selected, depending on the desired level of resolution.
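
The "theoretically expected" fragment numbers for a random base distribution can be approximated with a short calculation. The sketch below assumes a circular genome, independent bases, and equal frequencies of G and C (and of A and T); the function name `expected_sites` and the genome size and GC fraction used in the example are illustrative assumptions, not values from the study.

```python
def expected_sites(site, genome_length, gc_fraction):
    """Expected occurrences of a recognition site in a random circular genome.

    Assumes independent bases with P(G) = P(C) = gc_fraction / 2 and
    P(A) = P(T) = (1 - gc_fraction) / 2. On a circular molecule the
    number of fragments equals the number of cut sites.
    """
    probability = 1.0
    for base in site.upper():
        if base in "GC":
            probability *= gc_fraction / 2
        elif base in "AT":
            probability *= (1 - gc_fraction) / 2
        else:
            raise ValueError(f"unsupported base: {base}")
    return genome_length * probability

# Hypothetical AT-rich genome: an AT-rich site is expected far more often
# than a GC-rich site of the same length.
for site in ("GAATTC", "GGGCCC"):
    print(site, round(expected_sites(site, 40_000, 0.25), 2))
```

Observed counts that exceed this expectation for GC-rich sites are what the abstract interprets as evidence of a heterogeneous base distribution.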

6.

Introduction

Genomic base composition ranges from less than 25% AT to more than 85% AT in prokaryotes. Since only a small fraction of prokaryotic genomes is not protein coding, even a minor change in genomic base composition will induce profound protein changes. We examined how amino acid and codon frequencies were distributed in over 2000 microbial genomes and how these distributions were affected by base compositional changes. In addition, we wanted to know how genome-wide amino acid usage was biased in the different genomes and how changes to base composition and mutations affected this bias. To carry this out, we used a Generalized Additive Mixed-effects Model (GAMM) to explore non-linear associations and strong data dependences in closely related microbes; principal component analysis (PCA) was used to examine genomic amino acid and codon frequencies, while the concept of relative entropy was used to analyze genomic mutation rates.

Results

We found that genomic amino acid frequencies carried a stronger phylogenetic signal than codon frequencies, but that this signal was weak compared to that of genomic %AT. Further, in contrast to codon usage bias (CUB), amino acid usage bias (AAUB) was differently distributed in AT- and GC-rich genomes in the sense that AT-rich genomes did not prefer specific amino acids over others to the same extent as GC-rich genomes. AAUB was also associated with relative entropy; genomes with low AAUB contained more random mutations as a consequence of relaxed purifying selection than genomes with higher AAUB.

Conclusion

Genomic base composition has a substantial effect on both amino acid and codon frequencies in bacterial genomes. While phylogeny influenced amino acid usage more in GC-rich genomes, AT content was driving amino acid usage in AT-rich genomes. We found the GAMM model to be an excellent tool to analyze the genomic data used in this study.
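
The relative entropy referred to in this abstract is, in general terms, the Kullback-Leibler divergence between two frequency distributions. The sketch below shows only that general definition; which distributions the authors compared is not stated here, so the uniform reference and the helper names `frequencies` and `relative_entropy` are assumptions for illustration.

```python
import math
from collections import Counter

def frequencies(sequence):
    """Relative frequency of each symbol in a sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    return {symbol: count / total for symbol, count in counts.items()}

def relative_entropy(observed, reference):
    """Kullback-Leibler divergence D(observed || reference) in bits.

    Every symbol with non-zero observed frequency must also have a
    non-zero frequency in the reference distribution.
    """
    return sum(
        p * math.log2(p / reference[symbol])
        for symbol, p in observed.items()
        if p > 0
    )

# Hypothetical example: a toy protein compared with a uniform
# distribution over the 20 standard amino acids.
uniform = {aa: 1 / 20 for aa in "ACDEFGHIKLMNPQRSTVWY"}
print(relative_entropy(frequencies("MKKLLKKAIE"), uniform))
```

A value of zero means the observed usage matches the reference exactly; larger values indicate stronger bias.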

7.
8.
The organization of DNA in the mitotic metaphase and polytene chromosomes of the fungus gnat, Sciara coprophila, has been studied using base-specific DNA ligands, including anti-nucleoside antibodies. The DNA of metaphase and polytene chromosomes reacts with AT-specific probes (quinacrine, DAPI, Hoechst 33258 and anti-adenosine) and to a somewhat lesser extent with GC-specific probes (mithramycin, chromomycin A3 and anti-cytidine). In virtually every band of the polytene chromosomes chromomycin A3 fluorescence is almost totally quenched by counterstaining with the AT-specific ligand methyl green. This indicates that GC base pairs in most bands are closely interspersed with AT base pairs. The only exceptions are band IV-8A3 and the nucleolus organizer on the X. In contrast, quinacrine and DAPI fluorescence in every band is only slightly quenched by counterstaining with the GC-specific ligand actinomycin D. Thus, each band contains a moderate proportion of AT-rich DNA sequences with few interspersed GC base pairs. The C-bands in mitotic and polytene chromosomes can be visualized by Giemsa staining after differential extraction of DNA, and those in polytene chromosomes by the use of base-specific fluorochromes or antibodies without prior extraction of DNA. C-bands are located in the centromeric region of every chromosome, and the telomeric region of some. The C-bands in the polytene chromosomes contain AT-rich DNA sequences without closely interspersed GC base pairs and lack relatively GC-rich sequences. However, one C-band in the centromeric region of chromosome IV contains relatively GC-rich sequences with closely interspersed AT base pairs. C-bands make up less than 1% of polytene chromosomes compared to nearly 20% of mitotic metaphase chromosomes. The C-bands in polytene chromosomes are detectable with AT-specific or GC-specific probes while those in metaphase chromosomes are not.
Thus, during polytenization there is selective replication of highly AT-rich and relatively GC-rich sequences and underreplication of the remainder of the DNA sequences in the constitutive heterochromatin.

9.
C Zimmer, H Triebel. Biopolymers, 1969, 8(5): 573-593
Reversible and irreversible conformational changes in the acid-induced denaturation of DNA were studied by spectrophotometric titration, sedimentation, and melting measurements. A GC-rich DNA (72 mole-%) shows complete or partial reversibility of the titration profiles within the pH region of transition from helix to coil, while AT-rich DNA (29 mole-%) is irreversible in its titration behavior at each acid pH below the onset of the transition. The results for GC-rich DNA further indicate distinct differences in the titration behavior, which can be attributed to differences in the frequency of GC clusters along the DNA molecule. Plots of the sedimentation coefficient and the parameter asapp against pH lead to the conclusion that conformational changes occur before the onset of the acid-induced helix–coil transition. These alterations are more pronounced upon protonation of larger GC-rich domains than of smaller ones, as concluded from very marked differences observed in the sedimentation–pH behavior of two GC-rich DNAs. An acid denaturation scheme for a GC-rich DNA segment is suggested. Reversibility of the acid denaturation is explained by the existence of stable, protonated, single GC base pairs in nonprotonated stacked single-stranded domains formed in the acid-induced transition region.

10.
Oligonucleotide usage in archaeal and bacterial genomes can be linked to a number of properties, including codon usage (trinucleotides), DNA base-stacking energy (dinucleotides), and DNA structural conformation (di- to tetranucleotides). We wanted to assess the statistical information potential of different DNA ‘word-sizes’ and explore how oligonucleotide frequencies differ in coding and non-coding regions. In addition, we used oligonucleotide frequencies to investigate DNA composition and how DNA sequence patterns change within and between prokaryotic organisms. Among our findings was that prokaryotic chromosomes can be described by hexanucleotide frequencies, suggesting that prokaryotic DNA is predominantly short range correlated, i.e., information in prokaryotic genomes is encoded in short oligonucleotides. Oligonucleotide usage varied more within AT-rich and host-associated genomes than in GC-rich and free-living genomes, and this variation was mainly located in non-coding regions. Bias (selectional pressure) in tetranucleotide usage correlated with GC content, and coding regions were more biased than non-coding regions. Non-coding regions were also found to be approximately 5.5% more AT-rich than coding regions, on average, in the 402 chromosomes examined. Pronounced DNA compositional differences were found both within and between AT-rich and GC-rich genomes. GC-rich genomes were more similar and biased in terms of tetranucleotide usage in non-coding regions than AT-rich genomes. The differences found between AT-rich and GC-rich genomes may possibly be attributed to lifestyle, since tetranucleotide usage within host-associated bacteria was, on average, more dissimilar and less biased than in free-living archaea and bacteria.
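
Oligonucleotide ("word") frequencies of the kind discussed in this abstract can be tallied with overlapping windows. The sketch below is a generic k-mer counter, not the authors' pipeline; the function name `kmer_frequencies` and the choice to skip windows containing non-ACGT characters are assumptions.

```python
from collections import Counter

def kmer_frequencies(sequence, k=4):
    """Relative frequencies of overlapping k-mers in a DNA sequence.

    Windows containing characters other than A, C, G, T are skipped.
    """
    sequence = sequence.upper()
    valid = set("ACGT")
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= valid
    )
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()} if total else {}

# Tetranucleotide usage of a toy sequence.
for word, freq in sorted(kmer_frequencies("ACGTACGTNACGT").items()):
    print(word, round(freq, 3))
```

Setting `k=2`, `k=3`, or `k=6` gives the dinucleotide, trinucleotide, or hexanucleotide frequencies mentioned in the abstract.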

11.
Chen LL, Gao F. The FEBS Journal, 2005, 272(13): 3328-3336
Eukaryotic genomes are composed of isochores, i.e. long sequences relatively homogeneous in GC content. In this paper, the isochore structure of the Arabidopsis thaliana genome has been studied using a windowless technique based on the Z curve method, and intuitive curves are drawn for all five chromosomes. Using these curves, we can calculate the GC content at any resolution, even at the base level. It is observed that all five chromosomes are composed of several alternating GC-rich and AT-rich regions. Usually, these regions, named 'isochore-like regions', have large fluctuations in the GC content. Five isochores with little fluctuation are also observed. Detailed analyses have been performed for these isochores. A GC-rich 'isochore-like region' and a GC-isochore in chromosomes II and IV, respectively, are the nucleolar organizer regions (NORs), and genes located in the two regions prefer to use GC-ending codons. Another GC-isochore located in chromosome II is a mitochondrial DNA insertion region; the position and size of this region are precisely predicted by the current method. The amino acid usage and codon preference of genes in this organellar-to-nuclear transfer region differ significantly from those of other regions. Moreover, the centromeres are located in GC-rich 'isochore-like regions' in all five chromosomes. The current method can provide a useful tool for analyzing whole genomic sequences of eukaryotes.
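
The windowless idea can be illustrated with the z component of the Z curve, which accumulates (A + T) minus (G + C) along the sequence, so that the GC content of any segment follows from two points on the curve. This is a minimal sketch of that component only, assuming a sequence of plain A, C, G, T characters; the function names are hypothetical and the paper's own curves may be transformed differently.

```python
def z_component(sequence):
    """Cumulative (A + T) - (G + C) count; z[n] covers the first n bases."""
    z = [0]
    for base in sequence.upper():
        if base in "AT":
            z.append(z[-1] + 1)
        elif base in "GC":
            z.append(z[-1] - 1)
        else:
            z.append(z[-1])  # ambiguous bases are ignored in this sketch
    return z

def gc_content(z, start, end):
    """GC fraction of sequence[start:end], read directly from the curve.

    Valid when the segment contains only A, C, G, T, because then
    AT + GC = length and AT - GC = z[end] - z[start].
    """
    length = end - start
    return (length - (z[end] - z[start])) / (2 * length)

z = z_component("GGCCAATT")
print(gc_content(z, 0, 4), gc_content(z, 4, 8), gc_content(z, 0, 8))
```

Because the curve is computed once, the GC content of any segment, down to a single base, is available without choosing a sliding-window size.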

12.
13.
To gain deeper insights into principles of cell biology, it is essential to understand how cells reorganize their genomes by chromatin remodeling. We analyzed chromatin remodeling on next generation sequencing data from resting and activated T cells to determine a whole-genome chromatin remodeling landscape. We consider chromatin remodeling in terms of nucleosome repositioning, which can be observed most robustly in long nucleosome-free regions (LNFRs) that are occupied by nucleosomes in another cell state. We found that LNFR sequences are either AT-rich or GC-rich, and that nucleosome repositioning was observed much more prominently in GC-rich LNFRs, a considerable proportion of them outside promoter regions. Using support vector machines with string kernels, we identified a GC-rich DNA sequence pattern indicating loci of nucleosome repositioning in resting T cells. This pattern appears to be also typical for CpG islands. We found that nucleosome repositioning in GC-rich LNFRs is indeed associated with CpG islands and with binding sites of the CpG-island-binding ZF-CXXC proteins KDM2A and CFP1. That this association occurs prominently inside and also prominently outside of promoter regions hints at a mechanism governing nucleosome repositioning that acts on a whole-genome scale.

14.
15.
Prior studies on subfractions of mouse and kangaroo rat DNA have suggested that variations in base concentration within a given genome may not be great enough to account for Q-banding. To examine this with another species, calf DNA was subfractionated by CsCl ultracentrifugation into GC-rich satellites, and the main band DNA was further fractionated into AT-rich, intermediate and GC-rich portions. The effect of varying concentrations of these DNAs on quinacrine and Hoechst 33258 fluorescence was examined. Although with both compounds there was less fluorescence in the presence of the GC-rich satellites than main band fractions, these results per se did not answer the question of whether the variation in base composition alone was adequate to account for chromosome banding. To answer this, the fluorescence observed in the presence of DNA of a given base composition was related to the fluorescence observed in the presence of DNA of 40% GC content (F/F40). This allowed the derivation of a term B which indicated the relative change in fluorescence per 1% change in base composition of DNA. To determine the percent change in fluorescence observed in Q-banding, the photoelectric recordings of Caspersson et al. (1971) were used. From these data we conclude: 1. Quinacrine is twice as sensitive to changes in base composition as Hoechst 33258. 2. Variation in the base content of DNA along the chromosome is sufficient to account for most Q-banding, except possibly for some of the extremes of quinacrine fluorescence. This was further examined with daunomycin. Even though daunomycin gives good fluorescent banding, DNAs varying in base composition from 100 to 40% GC content all resulted in the same relative fluorescence of 0.03. However, in the presence of poly (dA-dT) the relative fluorescence was 0.85, indicating a great sensitivity to very AT-rich DNA.
This suggests that with daunomycin and possibly other fluorochromes, stretches of very AT-rich DNA may be more important in fluorescent banding than simple variation in mean base composition.

16.
The likely consequences, in terms of premature stop codons, detectable missense mutants, silent missense mutants, and degenerate codon changes, have been determined for all 12 individual base substitution changes. This has been done for the full genetic code of 61 sense codons and also for the much more limited codon availabilities of AT- or GC-rich DNA. The specificities and outcomes of individual base substitutions are likely to be rather different at AT- or GC-rich extremes, and also from the situation at an intermediate DNA base-ratio where all 61 sense codons are available. In particular, at DNA base-ratio extremes many mutations will be to non-utilized codons, which may well act as nonsense mutants. These in turn will give novel classes of suppressor-containing revertants. Even in bacteria with intermediate DNA base-ratios, particular codons for a given amino acid may be favoured over alternatives because their use maximizes, or minimizes, the mutational consequences of one or more base substitution changes.
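
The kind of tally described in this abstract can be reproduced in outline for the standard genetic code. The sketch below enumerates every single-base substitution in the 61 sense codons and sorts the outcomes into three coarse classes (nonsense, synonymous, missense); it does not attempt the abstract's finer distinction between detectable and silent missense changes, nor the restricted codon sets of AT- or GC-rich DNA. All names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODE = {"".join(codon): aa
        for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_substitutions():
    """Count outcomes of all single-base changes in the sense codons."""
    outcomes = Counter()
    for codon, amino_acid in CODE.items():
        if amino_acid == "*":
            continue  # start from sense codons only
        for position in range(3):
            for new_base in BASES:
                if new_base == codon[position]:
                    continue
                mutant = codon[:position] + new_base + codon[position + 1:]
                if CODE[mutant] == "*":
                    outcomes["nonsense"] += 1
                elif CODE[mutant] == amino_acid:
                    outcomes["synonymous"] += 1
                else:
                    outcomes["missense"] += 1
    return outcomes

print(classify_substitutions())
```

Restricting the loop to the codons actually used in an AT- or GC-rich genome would be one way to explore the biased cases the abstract discusses.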

17.
We used intensity and fluorescence lifetime microscopy (FLIM) of 3T3 nuclei to investigate the existence of AT-rich and GC-rich regions of the nuclear DNA. Hoechst 33258 (Ho) and 7-aminoactinomycin D (7-AAD) were used as fluorescence probes specific for AT and GC base pairs, respectively. YOYO-1 (Yo) was used as a dye that displays distinct fluorescence lifetimes when bound to AT or GC base pairs. We combined fluorescence imaging of Ho and 7-AAD with time-resolved measurements of Yo and took advantage of the additional information content of the time-resolved fluorescence. Because a single nucleus could not be stained and measured with all three dyes, we used texture analysis to compare the spatial distribution of AT-rich and GC-rich DNA in 100 nuclei in different phases of the cell cycle. The fluorescence intensity-based analysis of Ho- or 7-AAD-stained images indicates an increased number and larger size of the DNA condensation centers in the G2/M-phases compared to G0/1-phases. The lifetime-based study of Yo-stained images suggests spatial separation of the AT- or GC-rich DNA regions in the G2/M-phase. Texture analysis of fluorescence intensity and lifetime images was used to quantitatively study the spatial change of condensation and separation of AT- and GC-rich DNA during the cell cycle.

18.
Base compositions were examined at every position in codons of more than 50 genes from taxonomically different bacteria and of the corresponding antisense sequences on the bacterial genes. We propose that the nonstop frame on the antisense strand [NSF(a)] of GC-rich bacterial genes is the most promising sequence for newly-born genes. The reasons are: (i) NSF(a) frequently appears on the antisense strand of GC-rich bacterial genes; (ii) base compositions at the three positions in the codon are nearly symmetrical between a gene having around 55% GC content and the corresponding NSF(a); (iii) amino acid compositions of actual proteins are also similar to those of hypothetical proteins from the GC-rich NSF(a); and (iv) proteins from NSF(a) of 60% or more GC content are flexible enough to adapt to various molecules encountered as novel substrates, due to the high glycine content. To support our proposition, we used a computer to generate hypothetical antisense sequences with the same base compositions as NSF(a) at each base position in the codon, and examined the properties of the resulting proteins encoded by the imaginary genes. It was confirmed that the NSF(a) of a GC-rich gene carrying about 60% GC content is competent enough for a newly-born gene.
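
Position-specific base composition of a coding sequence and of its antisense strand can be computed as sketched below. This is a generic illustration, assuming an in-frame sequence of plain A, C, G, T characters; the function names and the toy sequence are hypothetical and not taken from the study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(sequence):
    """Antisense strand read 5' to 3'."""
    return sequence.upper().translate(COMPLEMENT)[::-1]

def gc_by_codon_position(cds):
    """GC fraction at codon positions 1, 2 and 3 of an in-frame sequence.

    Trailing bases that do not complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    fractions = []
    for position in range(3):
        bases = cds[position:usable:3]
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return fractions

gene = "ATGGCCGCG"  # toy in-frame sequence
print(gc_by_codon_position(gene))
print(gc_by_codon_position(reverse_complement(gene)))
```

Comparing the two printed lists is a simple way to inspect how symmetrical the codon-position compositions of a gene and its antisense frame are.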

19.
The advent of full genome sequences provides exceptionally rich data sets to explore molecular and evolutionary mechanisms that shape divergence among and within genomes. In this study, we use multivariate analysis to determine the processes driving genome-wide patterns of amino acid usage in the obligate endosymbiont Buchnera and its close free-living relative Escherichia coli. In the AT-rich Buchnera genome, the primary source of variation in amino acid usage differentiates high- and low-expression genes. Amino acids of high-expression Buchnera genes are generally less aromatic and use relatively GC-rich codons, suggesting that selection against aromatic amino acids and against amino acids with AT-rich codons is stronger in high-expression genes. Selection to maintain hydrophobic amino acids in integral membrane proteins is a primary factor driving protein evolution in E. coli but is a secondary factor in Buchnera. In E. coli, gene expression is a secondary force driving amino acid usage, and a correlation with tRNA abundance suggests that translational selection contributes to this effect. Although this and previous studies demonstrate that AT mutational bias and genetic drift influence amino acid usage in Buchnera, this genome-wide analysis argues that selection is sufficient to affect the amino acid content of proteins with different expression and hydropathy levels.

20.