Similar literature
 20 similar articles found (search time: 46 ms)
1.
Previous studies of the dinucleotides flanking both the 5' and 3' ends of homooligomer tracts have shown that some flanks are consistently preferred over others (1,2). In the first preferred group, the homooligomer tracts are flanked by the same nucleotide and/or the complementary nucleotides, e.g., ATAn, TTAn, CCGn, where n = 2-5. Runs flanked by nucleotides with which they cannot base pair are distinctly disfavored. (In this group An/Tn are flanked by C and/or G; Gn/Cn are flanked by A/T, e.g., CGAn, TnGG, GnAT). The frequencies of runs flanked by A or T, and G or C ("mixed" group) are as expected. Here we seek the origin of this effect and its relevance to protein-DNA interactions. Surprisingly, within the first group, runs flanked by their complements with a pyrimidine-purine junction (e.g., TTAn, CnGG) are greatly preferred. The frequencies of their purine-pyrimidine junction mirror-images are just as expected. This effect, as well as additional ones enumerated below, is seen universally in eukaryotes and in prokaryotes, although it is stronger in the former. Detailed analysis of regulatory regions shows these strong trends, particularly in GC sequences. The potential relationship to DNA conformation and DNA-protein interaction is discussed.
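
The flank tally described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `flank_counts`, the regular-expression approach, and the choice to skip runs that sit too close to the sequence ends are all assumptions made here. It finds maximal runs of one base with length n = 2-5 and counts the dinucleotide immediately 5' and 3' of each run.

```python
import re
from collections import Counter

def flank_counts(seq, base, min_len=2, max_len=5):
    """Count the dinucleotides flanking maximal runs of `base` (hypothetical helper).

    Returns two Counters: flanks on the 5' side and flanks on the 3' side.
    Runs longer than `max_len` or shorter than `min_len` are ignored.
    """
    seq = seq.upper()
    five, three = Counter(), Counter()
    # A maximal run is neither preceded nor followed by the same base.
    pattern = re.compile(
        r"(?<!%s)%s{%d,%d}(?!%s)" % (base, base, min_len, max_len, base)
    )
    for m in pattern.finditer(seq):
        start, end = m.span()
        if start >= 2:                 # need two bases upstream
            five[seq[start - 2:start]] += 1
        if end + 2 <= len(seq):        # need two bases downstream
            three[seq[end:end + 2]] += 1
    return five, three

five, three = flank_counts("TTAAAGGCCGGGAT", "A")
print(five, three)
```

Comparing such counts with the frequencies expected from base composition would be a separate step, which this sketch does not attempt.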

2.
Abstract

Previous studies of the dinucleotides flanking both the 5′ and 3′ ends of homooligomer tracts have shown that some flanks are consistently preferred over others (1,2). In the first preferred group, the homooligomer tracts are flanked by the same nucleotide and/or the complementary nucleotides, e.g., ATAn, TTAn, CCGn, where n = 2–5. Runs flanked by nucleotides with which they cannot base pair are distinctly disfavored. (In this group An/Tn are flanked by C and/or G; Gn/Cn are flanked by A/T, e.g., CGAn, TnGG, GnAT). The frequencies of runs flanked by A or T, and G or C (“mixed” group) are as expected. Here we seek the origin of this effect and its relevance to protein-DNA interactions. Surprisingly, within the first group, runs flanked by their complements with a pyrimidine-purine junction (e.g., TTAn, CnGG) are greatly preferred. The frequencies of their purine-pyrimidine junction mirror-images are just as expected. This effect, as well as additional ones enumerated below, is seen universally in eukaryotes and in prokaryotes, although it is stronger in the former. Detailed analysis of regulatory regions shows these strong trends, particularly in GC sequences. The potential relationship to DNA conformation and DNA-protein interaction is discussed.

3.
Summary. The eukaryotic and prokaryotic databases are scanned for potential nearest-neighbor doublet preferences at the 5′ and 3′ flanks of some oligomers. Here we focus on oligomers containing alternating nucleotides, i.e., UV, UVUV, and UUVV where U ≠ V. Strong, consistent trends are observed in eukaryotic sequences. A/T alternation oligomers are preferentially flanked by A/T. G/C flanks are disfavored. G/C alternation oligomers are preferentially flanked by G/C. A/T flanks are disfavored. These trends are consistent with those observed previously for homooligomer tracts (Nussinov et al. 1989a,b). G/C tracts are preferentially flanked by G/C. A/T nearest neighbors are disfavored. The reverse holds for A/T tracts. Additional patterns are described here as well. The possible origin of these DNA composition and sequence trends is discussed. These trends are suggested to stem from protein-DNA interaction constraints.

4.
Recent studies of homooligomer tracts suggest different characteristics from random sequence DNA. (dA).(dT) and (dG).(dC) tracts are frequent in upstream regions and in some cases have been shown to be essential for regulation. Here we examine homooligomer occurrences in non-coding and coding eukaryotic sequences, focusing on the context in which the homooligomers occur. This analysis of sequences in the junction areas yields distinct and consistent characteristics. In particular, the nucleotide interrupting a run is most frequently complementary to the run. The base next to it is most frequently identical to the one constituting the run. For A or T runs the least frequent nearest and next to nearest neighbors are G or C. For G or C tracts the least frequent are A or T. Complementary oligomers behave similarly. These and additional trends are strongest for run lengths ≥ 3. The computations are carried out on the whole eukaryotic database of more than 4 × 10^6 nucleotides, separately for coding and non-coding regions. These same trends are evident for both groups, but are somewhat stronger for the non-coding regions. The context in which the homooligomers occur may yield some clues to DNA conformation and its biological implications.

5.
Here, we study the frequencies of occurrence of homooligomers flanked by one base, XnU or UXn, where X = A, C, G, T and U ≠ X. Specifically, we search for preferences (or discriminations) in their nearest neighbor doublet, VV. Extensive analysis of the database reveals striking patterns in such VVUXn or UXnVV oligomers (V = A, C, G, T). With very few exceptions, if the VV and Xn are composed of complementary nucleotides, those oligomers having a pyrimidine (Y)-purine (R) junction are preferred over those with an RY one. If the VV and Xn nucleotides are not complementary, the RY junction oligomers are preferred over their YR counterparts. These trends are observed consistently in eukaryotic and prokaryotic sequences. They are particularly striking in the YR > RY oligomers containing complementary nucleotides. The general preferences and discriminations described here are in the same direction as our previous results for homooligomer tracts. These recurrences, along with some additional universal "rules", aid in our understanding of the ordering of nucleotides in the DNA.

6.
Rapoport AE, Trifonov EN. Gene, 2011, 488(1-2): 41-45
Linguistic (word count) analysis of prokaryotic genome sequences, by Shannon N-gram extension, reveals that the dominant hidden motifs in A+T rich genomes are T(A)(T)A and G(A)(T)C with an uncertain number of repeating A and T. Since prokaryotic sequences are largely protein-coding, the motifs would correspond to amphipathic alpha-helices with alternating lysine and phenylalanine as preferential polar and non-polar residues. The motifs are also known in eukaryotes, as nucleosome positioning patterns. Their existence in prokaryotes as well may serve for binding of histone-like proteins to DNA. In this case the above patterns in prokaryotes may be considered as "anticipated" nucleosome positioning patterns which, quite likely, existed in prokaryotic genomes before the evolutionary separation between eukaryotes and prokaryotes.

7.
The frequencies of occurrence of the 5' and 3' nearest neighbor doublets of oligonucleotides containing (G/C) and (A/T) blocks show strong trends. Specifically, the following trends are observed. Given a (G/C)n(A/T)m oligomer (where (G/C)n indicates a sequence of length n composed solely of Gs and/or Cs, (A/T)m is a sequence of length m composed solely of As and/or Ts, and n = 3, 2, 1; m = 1, 2, 3) and a (G/C)2 doublet, (G/C)n(A/T)m(G/C)2 > (G/C)n+2(A/T)m. That is, the (G/C)2 doublet is preferentially located 3' of the oligomer, enclosing the (A/T)m stretch. The trends are strongest for n = 3, m = 1 and gradually weaken as the size of the (G/C)n block decreases (with a concomitant increase of (A/T)m). The (A/T)2 nearest neighbor flank preferentially encloses the (G/C)n block (to produce (A/T)2(G/C)n(A/T)m). The (A/T)2 flank trends are weaker than the (G/C)2 flank ones. The (A/T)2 flank trends also decrease in strength as the size of the (G/C)n block decreases. The statistical significance of these trends in eukaryotes is very high. A possible correlation with DNA structural parameters, in particular groove geometry, is discussed.

8.
Studies of sequence context preferences of oligonucleotides composed of (G/C)n and (A/T)m blocks (n + m = 3, 4, 5) unravel strong patterns. Comparisons of the 5' and 3' nearest neighbor doublets flanking these oligomers reveal the preference of (G/C)2 to be positioned immediately next to the (A/T)m block, enclosing it by (G/C) nucleotides rather than extending the (G/C)n block. That is, for a (G/C)n(A/T)m oligomer and a (G/C)2 doublet, (G/C)n(A/T)m(G/C)2 > (G/C)n+2(A/T)m. Similarly, for an (A/T)m(G/C)n oligomer, (G/C)2(A/T)m(G/C)n > (A/T)m(G/C)n+2. In an analogous manner, (A/T)2 flanking doublets prefer enclosing the (G/C)n blocks, although these patterns are weaker. Here we show a strong, direct relationship between the magnitude of the trends and the presence of Cs in the (G/C)n block in the (G/C)n(A/T)m oligomer, and the presence of Gs in the complementary (A/T)m(G/C)n oligomers. The trends are stronger in eukaryotic than in prokaryotic sequences. They are stronger for longer (G/C)n and shorter (A/T)m blocks. We suggest that the preference for (A/T)m to be enclosed by (G/C) rather than be flanked by them on only one side is related to DNA structure and DNA-protein interaction. Sequences of the (G/C)(A/T)(G/C) type may have more homogeneous minor groove geometry. In particular, the strong G vs. C asymmetry in the trends may be related to pyrimidine-purine junctions, possibly to CG sequences.

9.
A statistical analysis of the occurrence of particular nucleotide runs (1–10 nucleotides long) in DNA sequences of different species has been carried out. There are considerable differences in run distributions in DNA sequences of prokaryotes, invertebrates and vertebrates. The distribution of various types of runs has been found to be different in coding and non-coding sequences. There is an abundance of short runs 1–2 nucleotides long in coding sequences, and there is a deficiency of such runs in the non-coding regions. However, some interesting exceptions to this rule exist: for the run distribution of adenine in prokaryotes and for the distribution of purine-pyrimidine runs in eukaryotes. This may be due to the fact that the distribution of runs is predetermined by structural peculiarities of the entire DNA molecule. Runs of guanine or cytosine three to six nucleotides long occur predominantly in the non-coding DNA regions in eukaryotes, especially in vertebrates.
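
A run-length distribution of the kind analysed in this abstract could be computed along the following lines. This is only a sketch under stated assumptions: the names `run_length_distribution` and `TO_RY` are invented here, runs longer than 10 are simply ignored, and purine/pyrimidine runs are obtained by first recoding the sequence to the two-letter R/Y alphabet.

```python
from collections import Counter
from itertools import groupby

# Recode bases as purine (R) or pyrimidine (Y); assumed helper table.
TO_RY = str.maketrans("AGCT", "RRYY")

def run_length_distribution(seq, max_len=10):
    """Map each symbol to a Counter of {run length: number of runs}."""
    dist = {}
    for symbol, group in groupby(seq.upper()):
        n = sum(1 for _ in group)      # length of this maximal run
        if n <= max_len:
            dist.setdefault(symbol, Counter())[n] += 1
    return dist

by_base = run_length_distribution("AATGGGCA")
by_class = run_length_distribution("AATGGGCA".translate(TO_RY))
print(by_base)
print(by_class)
```

Running the function separately on coding and non-coding segments would give the two distributions that the abstract compares.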

10.
11.
M J Behe, A M Beasty. DNA Sequence, 1991, 1(5): 291-302
Large variations in DNA base composition and noticeable strand asymmetries are known to occur between different organisms and within different regions of the genomes of single organisms. Apparently such composition and sequence biases occur to fulfill structural rather than informational requirements. Here we report the wide occurrence of a more subtle biasing of DNA sequence that can have structural consequences: an increase or a suppression of the number of long tracts of two-base co-polymers. Strong biases were observed when the DNA sequences of the longest eukaryotic, prokaryotic, and organellar entries in the GenBank database (totaling 773 kilobases) were analyzed for the number of occurrences of tracts of the two-base co-polymers (A,T)n, (G,C)n, and (A,C)n as a function of tract length. (The expression (A,T)n is used here to denote an uninterrupted tract, n nucleotides in length, of A and T bases in any proportion or order, terminated at each end by a G or C residue.) Characteristic differences are also observed in tract biases of eukaryotic vs. prokaryotic organisms.
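
The tract definition given in parentheses above maps directly onto a regular expression. The sketch below is an illustration only: `copolymer_tract_lengths` is a name chosen here, and extending the definition to (G,C)n and (A,C)n by requiring a terminating residue from the other two bases is an assumption, since the abstract spells out the terminators only for (A,T)n.

```python
import re
from collections import Counter

def copolymer_tract_lengths(seq, pair="AT"):
    """Count two-base co-polymer tracts by length.

    A tract is an uninterrupted stretch of the two bases in `pair`, in any
    order, bounded on both sides by one of the other two bases. Tracts that
    touch either end of the sequence are not counted.
    """
    others = "".join(b for b in "ACGT" if b not in pair)
    pattern = re.compile(r"(?<=[%s])[%s]+(?=[%s])" % (others, pair, others))
    return Counter(len(m.group()) for m in pattern.finditer(seq.upper()))

print(copolymer_tract_lengths("GATTACGGCATG", "AT"))
```

Plotting these counts against tract length, and against what a shuffled sequence of the same composition gives, is one way the reported biases could be examined.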

12.
M K Dosanjh, G Galeros, M F Goodman, B Singer. Biochemistry, 1991, 30(49): 11595-11599
The frequency of extending m6G.C or m6G.T pairs, when the 3' and 5' flanking neighbors of m6G are either cytosines or thymines, was investigated using primed 25-base-long oligonucleotides and the Klenow fragment of Escherichia coli DNA polymerase I (Kf). The efficiency, Vmax/Km, of extension to the following normal base pair was up to 40-fold greater than for the formation of the m6G.T or m6G.C pair. The frequencies of inserting either dCMP or dTMP opposite these m6G bases did not appear to be different in the two sequences, C-m6G-C and T-m6G-T, but extension was favored in the C-m6G-C sequence. The m6G.T pair extended to a C.G pair most efficiently, indicating that it was not a strong block to continued replication past the template lesion. Thus, m6G.T flanked by cytosines replicates more readily than when flanked by thymines, increasing G→A transitions. These data lend further support to the importance of sequence context in mutagenesis.

13.
Species-specific patterns of DNA bending and sequence. (Cited by: 16 in total; 6 self-citations; 10 by others)
Nucleotide sequences in the GenEMBL database were analyzed using strategies designed to reveal species-specific patterns of DNA bending and DNA sequence. The results uncovered striking species-dependent patterns of bending with more variations among individual organisms than between prokaryotes and eukaryotes. The frequency of bent sites in sequences from different bacteria was related to genomic A + T content and this relationship was confirmed by electrophoretic analysis of genomic DNA. However, base composition was not an accurate predictor for DNA bending in eukaryotes. Sequences from C. elegans exhibited the highest frequency of bent sites in the database and the RNA polymerase II locus from the nematode was the most bent gene in GenEMBL. Bent DNA extended throughout most introns and gene flanking segments from C. elegans while exon regions lacked A-tract bending characteristics. Independent evidence for the strong bending character of this genome was provided by electrophoretic studies which revealed that a large number of the fragments from C. elegans DNA exhibited anomalous gel mobilities when compared to genomic fragments from over 20 other organisms. The prevalence of bent sites in this genome enabled us to detect selectively C. elegans sequences in a computer search of the database using as probes C. elegans introns, bending elements, and a 20 nucleotide consensus sequence for bent DNA. This approach was also used to provide additional examples of species-specific sequence patterns in eukaryotes where it was shown that (A)≥10 and (A.T)≥5 tracts are prevalent throughout the untranslated DNA of D. discoideum and P. falciparum, respectively. These results provide new insight into the organization of eukaryotic DNA because they show that species-specific patterns of simple sequences are found in introns and in other untranslated regions of the genome.

14.
Inverted repeated DNA sequences are common in both prokaryotes and eukaryotes. We found that a plasmid-borne 94 base-pair inverted repeat (a perfect palindrome of 47 bp) containing a poly GT sequence is unstable in S. cerevisiae, with a minimal deletion frequency of about 10^-4 per mitotic division. Ten independent deletions had identical end points. Sequence analysis indicated that all deletions were the result of a DNA polymerase slippage event (or a recombination event) involving a 5-bp repeat (5' CGACG 3') that flanked the inverted repeat. The deletion rate and the types of deletions were unaffected by the rad52 mutation. Strains with the pms1 mutation had a 10-fold elevated frequency of instability of the inverted repeat. The types of sequence alterations observed in the pms1 background, however, were different from those seen in either the wild-type or rad52 genetic backgrounds.

15.
Abstract

The frequencies of occurrence of the 5′ and 3′ nearest neighbor doublets of oligonucleotides containing (G/C) and (A/T) blocks show strong trends. Specifically, the following trends are observed. Given a (G/C)n(A/T)m oligomer (where (G/C)n indicates a sequence of length n composed solely of Gs and/or Cs, (A/T)m is a sequence of length m composed solely of As and/or Ts, and n = 3, 2, 1; m = 1, 2, 3) and a (G/C)2 doublet, (G/C)n(A/T)m(G/C)2 > (G/C)n+2(A/T)m. That is, the (G/C)2 doublet is preferentially located 3′ of the oligomer, enclosing the (A/T)m stretch. The trends are strongest for n = 3, m = 1 and gradually weaken as the size of the (G/C)n block decreases (with a concomitant increase of (A/T)m). The (A/T)2 nearest neighbor flank preferentially encloses the (G/C)n block (to produce (A/T)2(G/C)n(A/T)m). The (A/T)2 flank trends are weaker than the (G/C)2 flank ones. The (A/T)2 flank trends also decrease in strength as the size of the (G/C)n block decreases. The statistical significance of these trends in eukaryotes is very high. A possible correlation with DNA structural parameters, in particular groove geometry, is discussed.

16.
We have studied the statistical constraints on synonymous codon choice to evaluate various proposals regarding the origin of the bias in synonymous codon usage observed by Fiers et al. (1975), Air et al. (1976), Grantham et al. (1980) and others. We have determined the statistical dependence of the degenerate third base on either of its nearest neighbors in mitochondrial, prokaryotic, and eukaryotic coding sequences. We noted an increasing dependence of the third base on its nearest neighbors in moving from mitochondria to prokaryotes to eukaryotes. A statistical model assuming random equiprobable selection of synonymous codons was found grossly adequate for the mitochondria, but totally inadequate for prokaryotes and eukaryotes. A model assuming selection of synonymous codons reflecting a genomic strategy, i.e. the genome hypothesis of Grantham et al. (1980), gave a good approximation of the mitochondrial sequences. A statistical model which exactly maintains codon frequency, but allows the position of corresponding synonymous codons to vary was only grossly adequate for prokaryotes and totally inadequate for eukaryotes. The results of these simulations are consistent with the measures on experimental sequences and suggest that a “frequency constraint” model such as that of Grantham et al. (1980) may be an adequate explanation of the codon usage in mitochondria. However, in addition to this frequency constraint, there may be constraints on synonymous codon choice in prokaryotes due to codon context. Furthermore, any proposal to explain codon usage in eukaryotes must involve a constraint on the context of a codon in the sequence.

17.
Mobile genetic elements: the agents of open source evolution (Cited by: 1 in total; 0 self-citations; 1 by others)
Horizontal genomics is a new field in prokaryotic biology that is focused on the analysis of DNA sequences in prokaryotic chromosomes that seem to have originated from other prokaryotes or eukaryotes. However, it is equally important to understand the agents that effect DNA movement: plasmids, bacteriophages and transposons. Although these agents occur in all prokaryotes, comprehensive genomics of the prokaryotic mobile gene pool or 'mobilome' lags behind other genomics initiatives owing to challenges that are distinct from cellular chromosomal analysis. Recent work shows promise of improved mobile genetic element (MGE) genomics and consequent opportunities to take advantage - and avoid the dangers - of these 'natural genetic engineers'. This review describes MGEs, their properties that are important in horizontal gene transfer, and current opportunities to advance MGE genomics.

18.
R Shah, R Cosstick, S C West. The EMBO Journal, 1997, 16(6): 1464-1472
The Escherichia coli RuvC protein resolves DNA intermediates produced during genetic recombination. In vitro, RuvC binds specifically to Holliday junctions and resolves them by the introduction of nicks into two strands of like polarity. In contrast to junction recognition, which occurs without regard for DNA sequence, resolution occurs preferentially at sequences that exhibit the consensus 5'-(A/T)TT/(G/C)-3' (where / indicates the site of incision). Synthetic Holliday junctions containing modified cleavage sequences have been used to investigate the mechanism of cleavage. The results indicate that specific DNA sequences are required for the correct docking of DNA into the two active sites of the RuvC dimer. In addition, using chemically modified oligonucleotides to introduce a hydrolysis-resistant 3'-S-phosphorothiolate linkage at the cleavage site, it was found that, as long as the sequence requirements are fulfilled, the two incisions could be uncoupled from each other. These results indicate that RuvC protein resolves Holliday junctions by a mechanism similar to that exhibited by certain restriction enzymes.
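
The consensus 5'-(A/T)TT/(G/C)-3' quoted above can be scanned for in a single strand with a short pattern match. This is a hedged sketch: `consensus_incision_sites` and `RUVC_CONSENSUS` are names invented here, and the code only locates sequence matches. It does not model junction geometry, the second strand, or whether RuvC would actually cut at a given match.

```python
import re

# (A/T) T T / (G/C): the incision falls between the second T and the G or C.
RUVC_CONSENSUS = re.compile(r"[AT]TT[GC]")

def consensus_incision_sites(strand):
    """Return 0-based indices i where a nick would fall between strand[i-1] and strand[i].

    `strand` is read 5' to 3'. Matches of this pattern cannot overlap, so a
    plain left-to-right scan finds all of them.
    """
    return [m.start() + 3 for m in RUVC_CONSENSUS.finditer(strand.upper())]

print(consensus_incision_sites("GCATTGCCTTTCAA"))
```

In the example strand the matches are ATTG and TTTC, so the reported indices point at the G and the C that follow each TT.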

19.
Two dipeptides, each containing a lysyl residue, were disubstituted with chlorambucil (CLB) and 2,6-dimethoxyhydroquinone-3-mercaptoacetic acid (DMQ-MA): DMQ-MA-Lys(CLB)-Gly-NH2 (DM-KCG) and DMQ-MA-beta-Ala-Lys(CLB)-NH2 (DM-BKC). These peptide-drug conjugates were designed to investigate sequence-specificity of DNA cleavage directed by the proximity effect of the DNA cleavage chromophore (DMQ-MA) situated close to the alkylating agent (CLB) inside a dipeptide moiety. Agarose electrophoresis studies showed that DM-KCG and DM-BKC possess significant DNA nicking activity toward supercoiled DNA whereas CLB and its dipeptide conjugate Boc-Lys(CLB)-Gly-NH2 display little DNA nicking activity. ESR studies of DMQ-MA and DM-KCG both showed five hyperfine signals centered at g = 2.0052, which are assigned to four radical forms at equilibrium that may give rise to a semiquinone radical responsible for DNA cleavage. Thermal cleavage studies at 90 °C on a 265-mer test DNA fragment showed that besides alkylation and cleavage at G residues, reactions with DM-KCG and DM-BKC show a preference for A residues with the sequence pattern: 5'-G-(A)n-Pur-3' > 5'-Pyr-(A)n-Pyr-3' (where n = 2-4). By contrast, DNA alkylation and cleavage by CLB occurs at most G and A residues with less sequence selectivity than seen with DM-KCG and DM-BKC. Thermal cleavage studies using N7-deazaG and N7-deazaA-substituted DNA showed that strong alkylation and cleavage at A residues by DM-KCG and DM-BKC is usually flanked on the 3' side by a G residue whereas strong cleavage at G residues is flanked by at least one purine residue on either the 5' or 3' side. At 65 °C, it is notable that the preferred DNA cleavage by DM-KCG and DM-BKC at A residues is significantly more marked than for G residues in the 265-mer DNA; the strongest sites of A-specific reaction occur within the sequences 5'-Pyr-(A)n-Pyr-3'; 5'-Pur-(A)n-G-3' and 5'-Pyr-(A)n-G-3'.
In pG4 DNA, cleavage by DM-KCG and DM-BKC is much greater than that by CLB at room temperature and at 65 °C. It was also observed that DM-KCG and DM-BKC cleaved at certain pyrimidine residues: C40, T66, C32, T34, and C36. These cleavages were also sequence selective since the susceptible pyrimidine residues were flanked by two purine residues on both the 5' and 3' sides or by a guanine residue on the 5' side. These findings strongly support the proposal that once the drug molecule is positioned so as to permit alkylation by the CLB moiety, the DMQ-MA moiety is held close to the alkylation site, resulting in markedly enhanced sequence-specific cleavage.

20.
It is known that the GT doublet is well conserved at the 5' exon/intron splice junction and is frequently embedded in the AGGT quartet. Although only the underlined G is invariable, splicing and ligation are accurately executed. In this work we search for additional conserved potential signals which may aid in 5' splice site recognition. Extensive searches which are not limited to a preconceived consensus sequence are carried out. We investigate the distributions of the 256 quartets in a 1000 nucleotide span around the 5' splice sites in approximately 1700 eukaryotic nuclear precursor mRNAs. Several potential signals are noted. Of particular interest are quartets containing runs of G, e.g., G4, G3T, G3C, G3A and AG3 in the intron immediately downstream and some C-containing quartets in the exon upstream of the 5' splice site. In an analogous calculation, (A)GGG(A) has also been found to be frequent in the intron, 60 nucleotides upstream and (A)CCC(A) in the exon downstream of the 3' splice site. These results are consistent with the recent indications that exon sequences may play a role in efficient splicing. Some models are proposed.
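
A position-resolved quartet tally around aligned splice sites, as described in this abstract, might look like the sketch below. The function name `quartet_profile`, the `(sequence, site_index)` record layout, and the convention that `site_index` is the 0-based position of the first intron base are assumptions made for illustration, and the toy window is far smaller than the 1000-nucleotide span used in the study.

```python
from collections import Counter, defaultdict

def quartet_profile(records, span=10):
    """Tally 4-mers by offset relative to each splice site.

    `records` is an iterable of (sequence, site_index) pairs. For each
    offset in [-span, span) the quartet starting at site_index + offset is
    counted, provided it lies fully inside the sequence.
    """
    profile = defaultdict(Counter)
    for seq, site in records:
        seq = seq.upper()
        for offset in range(-span, span):
            start = site + offset
            if start >= 0 and start + 4 <= len(seq):
                profile[offset][seq[start:start + 4]] += 1
    return profile

records = [("CCAGGTAAGGGG", 4), ("TTAGGTGGGGAA", 4)]
profile = quartet_profile(records, span=6)
print(profile[-2].most_common(1))
```

Here offset -2 captures the AGGT quartet that straddles the junction in both toy sequences. Quartets enriched at positive offsets would correspond to the intronic G-rich signals the abstract reports.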
