Similar articles
1.
We have used a polymorphism dataset on introns and coding sequences of X-linked loci in Drosophila americana to estimate the strength of selection on codon usage and/or biased gene conversion (BGC), taking into account a recent population expansion detected by a maximum-likelihood method. Drosophila americana was previously thought to have a stable demographic history, so this evidence for a recent population expansion means that previous estimates of selection need revision. There was evidence for natural selection or BGC favouring GC over AT variants in introns, which is stronger for GC-rich than GC-poor introns. By comparing introns and coding sequences, we found evidence for selection on codon usage bias, which is much stronger than the forces acting on GC versus AT base pairs in introns.

2.
Schmegner C, Hoegel J, Vogel W, Assum G. Genetics, 2007, 175(1): 421-428
The human genome is composed of long stretches of DNA with distinct GC contents, called isochores or GC-content domains. A boundary between two GC-content domains in the human NF1 gene region is also a boundary between domains of early- and late-replicating sequences and of regions with high and low recombination frequencies. The perfect conservation of the GC-content distribution in this region between human and mouse demonstrates that GC-content stabilizing forces must act regionally on a fine scale at this locus. To further elucidate the nature of these forces, we report here on the spectrum of human SNPs and base pair substitutions between human and chimpanzee. The results show that the mutation rate changes exactly at the GC-content transition zone from low values in the GC-poor sequences to high values in GC-rich ones. The GC content of the GC-poor sequences can be explained by a bias in favor of GC > AT mutations, whereas the GC content of the GC-rich segment may result from a fixation bias in favor of AT > GC substitutions. This fixation bias may be explained by direct selection by the GC content or by biased gene conversion.

3.
Galtier N, Bazin E, Bierne N. Genetics, 2006, 172(1): 221-228
The study of base composition evolution in Drosophila has been achieved mostly through the analysis of coding sequences. Third codon position GC content, however, is influenced by both neutral forces (e.g., mutation bias) and natural selection for codon usage optimization. In this article, large data sets of noncoding DNA sequence polymorphism in D. melanogaster and D. simulans were gathered from public databases to try to disentangle these two factors, since noncoding sequences are not affected by selection for codon usage. Allele frequency analyses revealed an asymmetric pattern of AT vs. GC noncoding polymorphisms: AT → GC mutations are less numerous, and tend to segregate at a higher frequency, than GC → AT ones, especially at GC-rich loci. This is indicative of nonstationary evolution of base composition and/or of GC-biased allele transmission. Fitting population genetics models to the allele frequency spectra confirmed this result and favored the hypothesis of a biased transmission. These results, together with previous reports, suggest that GC-biased gene conversion has influenced base composition evolution in Drosophila and explain the correlation between intron and exon GC content.

4.
The melting of the coding and non-coding classes of natural DNA sequences was investigated using a program, MELTSIM, which simulates DNA melting based upon an empirically parameterized nearest neighbor thermodynamic model. We calculated T(m) results of 8144 natural sequences from 28 eukaryotic organisms of varying F(GC) (mole fraction of G and C) and of 3775 coding and 3297 non-coding sequences derived from those natural sequences. These data demonstrated that the T(m) vs. F(GC) relationships in coding and non-coding DNAs are both linear but have a statistically significant difference (6.6%) in their slopes. These relationships are significantly different from the T(m) vs. F(GC) relationship embodied in the classical Marmur-Schildkraut-Doty (MSD) equation for the intact long natural sequences. By analyzing the simulation results from various base shufflings of the original DNAs and the average nearest neighbor frequencies of those natural sequences across the F(GC) range, we showed that these differences in the T(m) vs. F(GC) relationships are largely a direct result of systematic F(GC)-dependent biases in nearest neighbor frequencies for those two different DNA classes. Those differences in the T(m) vs. F(GC) relationships and biases in nearest neighbor frequencies also appear between the sequences from multicellular and unicellular organisms in the same coding or non-coding classes, albeit of smaller but significant magnitudes.

5.
Codon usage in Clonorchis sinensis was analyzed using 12,515 codons from 38 coding sequences. Total GC content was 49.83%, and GC1, GC2 and GC3 contents were 56.32%, 43.15% and 50.00%, respectively. The effective number of codons converged at 51-53 codons. When plotted against total GC content or GC3, codon usage was distributed in relation to GC3 biases. Relative synonymous codon usage for each codon revealed a single major trend, which was highly correlated with GC content at the third position when codons began with A or U at the first two positions. In codons beginning with G or C base at the first two positions, the G or C base rarely occurred at the third position. These results suggest that codon usage is shaped by a bias towards G or C at the third base, and that this is affected by the first and second bases.
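The GC1, GC2 and GC3 values quoted in this abstract are GC fractions taken separately at the first, second and third codon positions. A minimal sketch of that calculation is given below; the function name `gc_by_codon_position` and the toy sequence are illustrative assumptions, not taken from the study, and the abstract does not say how the authors handled ambiguous bases or partial codons.

```python
def gc_by_codon_position(cds: str) -> dict:
    """Return the overall GC fraction and the GC fraction at codon positions 1-3.

    Assumes `cds` is an in-frame coding sequence; a trailing partial codon
    is dropped. RNA input (U) is converted to DNA (T).
    """
    seq = cds.upper().replace("U", "T")
    seq = seq[: len(seq) - len(seq) % 3]  # keep whole codons only
    if not seq:
        raise ValueError("sequence must contain at least one full codon")

    def gc_fraction(bases: str) -> float:
        return sum(base in "GC" for base in bases) / len(bases)

    return {
        "GC": gc_fraction(seq),
        "GC1": gc_fraction(seq[0::3]),  # first codon positions
        "GC2": gc_fraction(seq[1::3]),  # second codon positions
        "GC3": gc_fraction(seq[2::3]),  # third codon positions
    }


# Hypothetical toy sequence (six codons), for illustration only.
toy_cds = "ATGGCCAAAGGGTTTTAA"
for name, value in gc_by_codon_position(toy_cds).items():
    print(f"{name}: {value:.3f}")
```

For a real analysis the per-gene values would normally be pooled or averaged over all coding sequences in the data set.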

6.
Vanishing GC-rich isochores in mammalian genomes   Cited by: 25 (self-citations: 0, citations by others: 25)
Duret L, Semon M, Piganeau G, Mouchiroud D, Galtier N. Genetics, 2002, 162(4): 1837-1847
To understand the origin and evolution of isochores (the peculiar spatial distribution of GC content within mammalian genomes), we analyzed the synonymous substitution pattern in coding sequences from closely related species in different mammalian orders. In primates and cetartiodactyls, GC-rich genes are undergoing a large excess of GC → AT substitutions over AT → GC substitutions: GC-rich isochores are slowly disappearing from the genome of these two mammalian orders. In rodents, our analyses suggest both a decrease in GC content of GC-rich isochores and an increase in GC-poor isochores, but more data will be necessary to assess the significance of this pattern. These observations question the conclusions of previous works that assumed that base composition was at equilibrium. Analysis of allele frequency in human polymorphism data, however, confirmed that in the GC-rich parts of the genome, GC alleles have a higher probability of fixation than AT alleles. This fixation bias appears not strong enough to overcome the large excess of GC → AT mutations. Thus, whatever the evolutionary force (neutral or selective) at the origin of GC-rich isochores, this force is no longer effective in mammals. We propose a model based on the biased gene conversion hypothesis that accounts for the origin of GC-rich isochores in the ancestral amniote genome and for their decline in present-day mammals.

7.
The interactions of DAPI with natural DNA and synthetic polymers have been investigated by hydrodynamic, DNase I footprinting, spectroscopic, binding, and kinetic methods. Footprinting results at low ratios (compound to base pair) are similar for DAPI and distamycin. At high ratios, however, GC regions are blocked from enzyme cleavage by DAPI but not by distamycin. Both poly[d(G-C)]2 and poly[d(A-T)]2 induce hypochromism and shifts of the DAPI absorption band to longer wavelengths, but the effects are larger with the GC polymer. NMR shifts of DAPI protons in the presence of excess AT and GC polymers are significantly different, upfield for GC and mixed small shifts for AT. The dissociation rate constants and effects of salt concentration on the rate constants are also quite different for the AT and the GC polymer complexes. The DAPI dissociation rate constant is larger with the GC polymer but is less sensitive to changes in salt concentration than with the AT complex. Binding of DAPI to the GC polymer and to poly[d(A-C)].poly[d(G-T)] exhibits slight negative cooperativity, characteristic of a neighbor-exclusion binding mode. DAPI binding to the AT polymer is unusually strong and exhibits significant positive cooperativity. DAPI has very different effects on the bleomycin-catalyzed cleavage of the AT and GC polymers, a strong inhibition with the AT polymer but enhanced cleavage with the GC polymer. All of these results are consistent with two totally different DNA binding modes for DAPI in regions containing consecutive AT base pairs versus regions containing GC or mixed GC and AT base pair sequences. The binding mode at AT sites has characteristics which are similar to those of the distamycin-AT complex, and all results are consistent with a cooperative, very strong minor groove binding mode. In GC and mixed-sequence regions the results are very similar to those observed with classical intercalators such as ethidium and indicate that DAPI intercalates in DNA sequences which do not contain at least three consecutive AT base pairs.

8.
Experimental estimates of the premelting adenine-thymine base pair opening probability for some B-DNA sequences are two orders of magnitude smaller than those of other B-DNA sequences. The AT pairs in the sequences with the smaller opening probability seem to be those that have a well defined spine of hydration in the minor groove. We show that this spine of hydration can significantly enhance the thermal stability of the base pairs to which it is attached. The effect of this spine of hydration, coupled with the possible stabilizing effect contributed by neighboring GC pairs, can explain the differences in the observed AT pair opening probability for different AT-containing B-DNA sequences.

9.
The various nearest neighbor stacking interaction energies of stacked base pairs in the DNA double helix are calculated for both A- and B-type conformations using an ab initio molecular orbital method. It is demonstrated that the sequence-dependent conformational preference for A- or B-type results from the stacking interaction. In particular, the base sequence showing the highest preference for an A-type conformation is revealed as GC/GC, and the one with the next highest preference, AT/AT; for a B-type conformation, the respective sequences are CG/CG and CA/TG. The overall conformation of a DNA fragment is not determined by these particular sequences only but is influenced by all base pair steps. An intrinsically favorable conformation is predicted from the constituent stacking interaction.

10.
A frequently used approach for detecting potential coding regions is to search for stop codons. In the standard genetic code 3 out of 64 trinucleotides are stop codons. Hence, in random or non-coding DNA one can expect every 21st trinucleotide to have the same sequence as a stop codon. In contrast, the open reading frames (ORFs) of most protein-coding genes are considerably longer. Thus, the stop codon frequency in coding sequences deviates from the background frequency of the corresponding trinucleotides. This has been utilized for gene prediction, in particular, in detecting protein-coding ORFs. Traditional methods based on stop codon frequency rely on the assumption that the GC content is about 50%. However, many genomes show significant deviations from that value. With the presented method we can describe the effects of GC content on the selection of appropriate length thresholds of potentially coding ORFs. Conversely, for a given length threshold, we can calculate the probability of observing it in a random sequence. Thus, we can derive the maximum GC content for which ORF length is practicable as a feature for gene prediction methods and the resulting false positive rates. A rough estimate for an upper limit is a GC content of 80%. This estimate can be made more precise by including further parameters and by taking into account start codons as well. We demonstrate the feasibility of this method by applying it to the genomes of the bacteria Rickettsia prowazekii, Escherichia coli and Caulobacter crescentus, exemplifying the effect of GC content variations according to our predictions. We have adapted the method for predicting coding ORFs by stop codon frequency to the case of GC contents different from 50%. Usually, several methods for gene finding need to be combined. Thus, our results concern a specific part within a package of methods. Interestingly, for genomes with low GC content such as that of R. prowazekii, the presented method provides remarkably good results even when applied alone.
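The dependence of stop codon frequency on GC content described in this abstract can be illustrated with a simple background model. The sketch below is not the authors' method: it assumes independent bases with P(A) = P(T) and P(G) = P(C), and the function names and the default false-positive level `alpha` are hypothetical choices made for illustration.

```python
import math


def stop_codon_probability(gc: float) -> float:
    """Probability that a random trinucleotide is TAA, TAG or TGA.

    Assumes independent bases with P(A) = P(T) = (1 - gc) / 2 and
    P(G) = P(C) = gc / 2 (a simplifying assumption).
    """
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must be a fraction between 0 and 1")
    p_at = (1.0 - gc) / 2.0  # probability of A, and of T
    p_gc = gc / 2.0          # probability of G, and of C
    # TAA contributes p_at**3; TAG and TGA each contribute p_at**2 * p_gc.
    return p_at ** 3 + 2.0 * p_at ** 2 * p_gc


def orf_length_threshold(gc: float, alpha: float = 0.01) -> int:
    """Smallest codon count n such that a random reading frame stays
    free of stop codons for n codons with probability <= alpha."""
    p_stop = stop_codon_probability(gc)
    if p_stop == 0.0:
        raise ValueError("no stop codons are possible at this GC content")
    return math.ceil(math.log(alpha) / math.log(1.0 - p_stop))


for gc in (0.3, 0.5, 0.8):
    p = stop_codon_probability(gc)
    print(f"GC={gc:.0%}: one stop per {1 / p:.1f} codons, "
          f"threshold={orf_length_threshold(gc)} codons")
```

At 50% GC the model reproduces the 3/64 background frequency quoted in the abstract. As GC content rises, stop codons become rarer in random sequence, so a longer ORF is needed before its length alone is informative.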

11.
Parallel-stranded (ps) DNAs with mixed AT/GC content comprising G.C pairs in a varying sequence context have been investigated. Oligonucleotides were devised consisting of two 10-nt strands complementary either in a parallel or in an antiparallel orientation and joined via nonnucleotide linkers so as to form 10-bp ps or aps hairpins. A predominance of intramolecular hairpins over intermolecular duplexes was achieved by choice of experimental conditions and verified by fluorescence determinations yielding estimations of rotational relaxation times and fractional base pairing. A multistate mode of ps hairpin melting was revealed by temperature gradient gel electrophoresis (TGGE). The thermal stability of the ps hairpins with mixed AT/GC content depends strongly on the specific sequence in a manner peculiar to the ps double helix. The thermodynamic effects of incorporating trans G.C base pairs into an AT sequence are context-dependent: an isolated G.C base pair destabilizes the duplex whereas a block of ≥2 consecutive G.C base pairs exerts a stabilizing effect. A multistate heterogeneous zipper model for the thermal denaturation of the hairpins was derived and used in a global minimization procedure to compute the thermodynamic parameters of the ps hairpins from experimental melting data. In 0.1 M LiCl at 3 °C, the formation of a trans G.C pair in a GG/CC sequence context is approximately 3 kJ mol⁻¹ more favorable than the formation of a trans A.T pair in an AT/TA sequence context. However, GC/AT contacts contribute a substantial unfavorable free energy difference of approximately 2 kJ mol⁻¹. As a consequence, the base composition and fractional distribution of isolated and clustered G.C base pairs determine the overall stability of ps-DNA with mixed AT/GC sequences. Thus, the stability of ps-DNA comprising ≥2 successive G.C base pairs is greater than that of ps-DNA with an alternating AT sequence, whereas increasing the number of AT/GC contacts by isolating G.C base pairs exerts a destabilizing effect on the ps duplex. Molecular modeling of the various helices by force field techniques provides insight into the structural basis for these distinctions.

12.
The base-catalysed imino proton exchange in DNA oligonucleotides of different sequences and lengths was studied by 1H-NMR saturation recovery experiments. The self-complementary sequences studied were GCGCGAATTCGCGC (I), CGCGAATTCGCG (II), GCGAATTCGC (III), and CGCGATCGCG (IV). The evaluation of base pair lifetimes was made after correction for the measured 'absence of added catalyst' effect, which was found to be characterized by recovery times of 400-500 ms for the AT base pairs and 250-300 ms for the GC base pairs at 15 °C. End effects with rapid exchange are noticeable up to 3 base pairs from either end of the duplexes. The inner hexamer cores GAATTC of sequences I-II show similar base pair lifetime patterns, around 30 ms for the innermost AT, 5-10 ms for the outer AT and 20-50 ms for the GC base pairs at 15 °C. The shorter sequences III and particularly IV show much shorter lifetimes in their central AT base pairs (11 ms and 1 ms, respectively).
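"Self-complementary" here means that each oligonucleotide equals its own reverse complement, so two copies of the same strand can pair into a duplex. The short sketch below checks that property for the four sequences listed in the abstract; the helper names are illustrative assumptions.

```python
# Watson-Crick complement table for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_self_complementary(seq: str) -> bool:
    """True if the sequence is identical to its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


# The four oligonucleotides named in the abstract.
oligos = {
    "I": "GCGCGAATTCGCGC",
    "II": "CGCGAATTCGCG",
    "III": "GCGAATTCGC",
    "IV": "CGCGATCGCG",
}

for label, seq in oligos.items():
    print(f"{label}: {seq} self-complementary = {is_self_complementary(seq)}")
```

All four sequences pass the check, which is what allows each to form a duplex with itself in the exchange experiments.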

13.
14.
Detailed analyses of the sequence-dependent solvation and ion atmosphere of DNA are presented based on molecular dynamics (MD) simulations on all the 136 unique tetranucleotide steps obtained by the ABC consortium using the AMBER suite of programs. Significant sequence effects on solvation and ion localization were observed in these simulations. The results were compared to essentially all known experimental data on the subject. Proximity analysis was employed to highlight the sequence-dependent differences in solvation and ion localization properties in the grooves of DNA. Comparison of the MD-calculated DNA structure with canonical A- and B-forms supports the idea that the G/C-rich sequences are closer to canonical A- than B-form structures, while the reverse is true for the poly A sequences, with the exception of the alternating ATAT sequence. Analysis of hydration density maps reveals that the flexibility of the solute molecule has a significant effect on the nature of observed hydration. Energetic analysis of solute-solvent interactions based on proximity analysis of solvent reveals that the GC or CG base pairs interact more strongly with water molecules in the minor groove of DNA than the AT or TA base pairs, while the interactions of the AT or TA pairs in the major groove are stronger than those of the GC or CG pairs. Computation of solvent-accessible surface area of the nucleotide units in the simulated trajectories reveals that the similarity with results derived from analysis of a database of crystallographic structures is excellent. The MD trajectories tend to follow Manning's counterion condensation theory, presenting a region of condensed counterions within a radius of about 17 Å from the DNA surface independent of sequence. The GC and CG pairs tend to associate with cations in the major groove of the DNA structure to a greater extent than the AT and TA pairs. Cation association is more frequent in the minor groove of AT than of GC pairs. In general, the observed water and ion atmosphere around the DNA sequences in the MD simulations is in good agreement with experimental observations.
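The figure of 136 unique tetranucleotide steps follows from treating a tetranucleotide and its reverse complement as the same duplex step: of the 256 possible tetramers, 16 are their own reverse complement, giving (256 + 16) / 2 = 136. The sketch below enumerates them; the function name `unique_steps` is an illustrative assumption, not something defined by the ABC consortium.

```python
from itertools import product

# Watson-Crick complement table for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def unique_steps(k: int) -> list:
    """List one canonical representative per k-mer, counting a k-mer and
    its reverse complement as the same double-stranded step."""
    canonical = set()
    for bases in product("ACGT", repeat=k):
        kmer = "".join(bases)
        # Keep the alphabetically smaller of the two strand readings.
        canonical.add(min(kmer, reverse_complement(kmer)))
    return sorted(canonical)


tetramers = unique_steps(4)
palindromes = [t for t in tetramers if t == reverse_complement(t)]
print(len(tetramers))    # 136 unique tetranucleotide steps
print(len(palindromes))  # 16 self-complementary tetramers
```

The same function gives the familiar 10 unique dinucleotide steps for `k = 2`.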

15.
By means of restriction enzyme analysis and molecular hybridization, the distribution of repeated DNA families has been studied in the different DNA components into which the human genome can be fractionated by density gradient techniques. Three classes of DNA molecules have been analyzed: (i) a homogeneous DNA component (satellite-like sequences; ρ = 1.696 g/cm³, 3% of total DNA, AT repeated), (ii) an AT-rich DNA component (ρ = 1.698 g/cm³, 30% of total DNA, AT main-band) and (iii) a GC-rich DNA component (ρ = 1.708 g/cm³, 6% of total DNA, GC main-band). By this approach we have observed that Sau3A digestion of GC main-band gives rise to two bands of 75 bp and 150 bp, absent or under-represented in both AT-rich DNA components. A preliminary characterization of these DNA fragments suggests that they contain one or more families of repeated sequences which fail to hybridize to EcoRI, HindIII and AluI families of repeats. In addition, we have observed that EcoRI sequences (alpha-RI DNA) are under-represented in GC main-band and show the same clustered organization in both AT-rich DNA components.

16.
We have investigated the mitochondrial genome of eight ori-zero spontaneous petite mutants of Saccharomyces cerevisiae. The tandem repeat units of these genomes do not contain any of the seven canonical ori sequences of the wild-type genome. Instead, they contain one or more ori-S sequences. These 44-nucleotide long surrogate origins of replication are a subset of GC clusters characterized by a potential secondary fold with two sequences, ATAG and GGAG, inserted in AT spacers, two AT base pairs just following them, a GC stem (broken in the middle and, in most cases, also near the base by non-paired nucleotides), and a terminal loop. This structure is reminiscent of that of GC clusters A and B from canonical ori sequences and supports the view (Bernardi, 1982a) that the GC clusters of the mitochondrial genome arose, by an expansion process, from the canonical ori sequences. Like the latter, ori-S sequences are present in both orientations, are located in intergenic regions, and can be used as excision sequences when tandemly oriented. Again as in the case of canonical ori sequences, the density of ori-S sequences on the repeat units of petite genomes is correlated with the replication efficiency of the latter, as assessed by the outcome of crosses with wild-type or petite tester strains.

17.
Structure, variability, and molecular evolution of the trnT-F region in the Bryophyta (mosses and liverworts) are analyzed based on about 200 sequences of the trnT-L spacer and trnL 5' exon, 1000 sequences of the trnL intron, and 800 sequences of the trnL 3' exon and trnL-F spacer, including comparisons of lengths, GC contents, sequence similarities, and functional elements. Mutations occurring in the trnL 5' and 3' exons, including compensatory base pair changes, and a transition in the trnL anticodon in Takakia lepidozioides, are discussed. All three non-coding regions display a mosaic structure of highly variable elements (V1-V3 in the trnT-L spacer, V4/V5 corresponding to stem-loop regions P6/P8 in the trnL intron, and V6/V7 in the trnL-F spacer) and more conserved elements. In the trnL intron this structure is a consequence of the defined secondary structure necessary for correct splicing, whereas in both spacers conserved regions are restricted to promoter elements. At least the highly variable regions in the trnT-L spacer and stem-loop region P8 of the trnL intron seem to evolve independently in the major bryophyte lineages and are therefore not suitable for high taxonomic level phylogenetic reconstructions. In mosses, a trend of length reduction towards the more derived lineages is observed in all three non-coding regions. GC contents are mostly linked to sequence variability, with the conserved regions being more GC rich and the more variable regions AT rich. The lowest GC values (< 10%) are found in the trnT-L spacer of mosses. In addition to two putative sigma-70-type promoters in the trnT-L spacer, a third putative promoter is present in the trnL-F spacer, although trnL and trnF are assumed to be co-transcribed. Consensus sequences are provided for the -35 and -10 sequences of the major bryophyte lineages. The third promoter is part of a hairpin secondary structure, whose loop region is highly homoplastic in mosses due to an inversion occurring independently in non-related taxa, even at the intraspecific level.

18.
With the increasing number and variety of genome sequences available, control of gene expression with synthetic, cell-permeable molecules is within reach. The variety of sequence-specific binding agents is, however, still quite limited. Many minor groove binding agents selectively recognize AT over GC sequences but have less ability to distinguish among different AT sequences. The goal of this article is to develop compounds that can bind selectively to different AT sequences. A number of studies indicate that AATT and TTAA sequences have significantly different physical and interaction properties and different requirements for minor groove recognition. Although it has been difficult to get minor groove binding at TTAA, DB293, a phenyl-furan-benzimidazole diamidine, was found to bind as a strong, cooperative dimer at TTAA but with no selectivity over AATT. In order to improve selectivity, we made modifications to each unit of DB293. Binding affinities and stoichiometries obtained from biosensor-surface plasmon resonance experiments show that DB1003, a furan-furan-benzimidazole diamidine, binds strongly to TTAA as a dimer and has selectivity (KTTAA/KAATT = 6). CD and DNase I footprinting studies confirmed the preference of this compound for TTAA. In summary, (i) a favorable stacking surface provided by the pi system, (ii) H-bond donors to interact with TA base pairs at the floor of the groove provided by a benzimidazole (or indole) -NH and amidines, and (iii) appropriate curvature of the dimer complex to match the curvature of the minor groove play important roles in differentiating the TTAA and AATT minor grooves.

19.
We have analyzed the effect of base composition at the center of symmetry of inverted repeated DNA sequences on cruciform transitions in supercoiled DNA. For this we have constructed two series of palindromic DNA sequences: one set with differing center and one set with differing center and arm sequences. The F series consists of two 96-base pair perfect inverted repeats which are identical except for the central 10 base pairs which consist of pure AT or GC base pairs. The S series was constructed such that the overall base composition of the inverted repeats was identical but in which the positioning of blocks of AT- and GC-rich sequences varied. The rate of cruciform formation for the inverted repeats in plasmid pUC8 was dramatically influenced by the 8-10 base pairs at the center of the inverted repeat. Inverted repeats with 8-10 AT base pairs in the center were kinetically much more active in cruciform formation than inverted repeats with 8-10 GC base pairs in the center. These experiments show a dominant influence of the center sequences of inverted repeats on the rate of cruciform formation.

20.