Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
2.
Curved DNA fragments are often found near functionally important sites such as promoters and origins of replication, and hence sequence-dependent DNA curvature prediction is of great utility in genomics and bioinformatics. In light of this, an assessment of three different dinucleotide step parameters (based on gel retardation as well as crystal structure data) is carried out. These parameters (BMHT, LB and CS) are evaluated quantitatively for their ability to predict correctly the experimental results of a large set of nucleic acid sequences containing A-tracts as well as GC-rich motifs. This set contained around 40 synthetic as well as natural sequences whose solution properties have been well characterized experimentally. All three models could account reasonably well for curvature in the various DNA sequences. The CS model, where dinucleotide parameters are calculated from crystal structure data, consistently shows slightly better correlation with experimental data. Our simple analysis also indicates that presently available trinucleotide parameters fail to predict curvature in some of the well-characterized sequences. The study shows that the dinucleotide parameters with some further refinement can be used to predict sequence-dependent curvature correctly in genomic sequences.

3.
Microsatellite instability induced by hydrogen peroxide in Escherichia coli   Total citations: 1, self-citations: 0, citations by others: 1
Damage to DNA by reactive oxygen species may be a significant source of endogenous mutagenesis in aerobic organisms. Using a selective assay for microsatellite instability in E. coli, we have asked whether endogenous oxidative mutagenesis can contribute to genetic instability. Instability of repetitive sequences, both in intronic sequences and within coding regions, is a hallmark of genetic instability in human cancers. We demonstrate that exposure of E. coli to low levels of hydrogen peroxide increases the frequency of expansions and deletions within dinucleotide repetitive sequences. Sequencing of the repetitive sequences and flanking non-repetitive regions in mutant clones demonstrated the high specificity for alterations within the repeats. All of the 183 mutants sequenced displayed frameshift alterations within the microsatellite repeats, and no base substitutions or frameshift mutations occurred within the flanking non-repetitive sequences. We hypothesize that endogenous oxidative damage to DNA can increase the frequency of strand slippage intermediates occurring during DNA replication or repair synthesis, and contribute to genomic instability.

4.
High-throughput sequencing technologies have opened up a new avenue for studying extinct organisms. Here we identify and quantify biases introduced by particular characteristics of ancient DNA samples. These analyses demonstrate the importance of closely related genomic sequence for correctly identifying and classifying bona fide endogenous DNA fragments. We show that more accurate genome divergence estimates from ancient DNA sequence can be attained using at least two outgroup genomes and appropriate filtering.

5.
Many bacteria are naturally competent, able to actively transport environmental DNA fragments across their cell envelope and into their cytoplasm. Because incoming DNA fragments can recombine with and replace homologous segments of the chromosome, competence provides cells with a potent mechanism of horizontal gene transfer as well as access to the nutrients in extracellular DNA. This review starts with an introductory overview of competence and continues with a detailed consideration of the DNA uptake specificity of competent proteobacteria in the Pasteurellaceae and Neisseriaceae. Species in these distantly related families exhibit strong preferences for genomic DNA from close relatives, a self-specificity arising from the combined effects of biases in the uptake machinery and genomic overrepresentation of the sequences this machinery prefers. Other competent species tested lack obvious uptake bias or uptake sequences, suggesting that strong convergent evolutionary forces have acted on these two families. Recent results show that uptake sequences have multiple “dialects,” with clades within each family preferring distinct sequence variants and having corresponding variants enriched in their genomes. Although the genomic consensus uptake sequences are 12 and 29 to 34 bp, uptake assays have found that only central cores of 3 to 4 bp, conserved across dialects, are crucial for uptake. The other bases, which differ between dialects, make weaker individual contributions but have important cooperative interactions. Together, these results make predictions about the mechanism of DNA uptake across the outer membrane, supporting a model for the evolutionary accumulation and stability of uptake sequences and suggesting that uptake biases may be more widespread than currently thought.

6.
Along the gene, nucleotides in various codon positions tend to exert a slight but observable influence on the nucleotide choice at neighboring positions. Such context biases are different in different organisms and can be used as genomic signatures. In this paper, we will focus specifically on the dinucleotide composed of a third codon position nucleotide and its succeeding first position nucleotide. Using the 16 possible dinucleotide combinations, we calculate how well individual genes conform to the observed mean dinucleotide frequencies of an entire genome, forming a distance measure for each gene. It is found that genes from different genomes can be separated with a high degree of accuracy, according to these distance values. In particular, we address the problem of recent horizontal gene transfer, and how imported genes may be evaluated by their poor assimilation to the host's context biases. By concentrating on the third- and succeeding first position nucleotides, we eliminate most spurious contributions from codon usage and amino-acid requirements, focusing mainly on mutational effects. Since imported genes are expected to converge only gradually to genomic signatures, it is possible to question whether a gene present in only one of two closely related organisms has been imported into one organism or deleted in the other. Striking correlations between the proposed distance measure and poor homology are observed when Escherichia coli genes are compared to Salmonella typhi, indicating that sets of outlier genes in E. coli may contain a high number of genes that have been imported into E. coli, and not deleted in S. typhi. Received: 16 January 2001 / Accepted: 30 August 2001
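
The per-gene distance described in this abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact metric, so the sum of absolute frequency differences used here, the averaging of per-gene frequencies to obtain the genome mean, and the function names `context_freqs`, `genome_mean` and `gene_distance` are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from itertools import product

# The 16 possible dinucleotides, in a fixed order.
DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]

def context_freqs(cds):
    """Frequencies of dinucleotides made of codon position 3 plus the next codon's position 1."""
    cds = cds.upper()
    # Codon k spans indices 3k..3k+2, so each (3rd, next 1st) pair starts at index 3k+2.
    pairs = (cds[i:i + 2] for i in range(2, len(cds) - 1, 3))
    counts = Counter(p for p in pairs if p in DINUCS)  # ignore ambiguous bases
    total = sum(counts.values())
    return {d: (counts[d] / total if total else 0.0) for d in DINUCS}

def genome_mean(genes):
    """Mean of the per-gene frequency vectors (assumed definition of the genome mean)."""
    per_gene = [context_freqs(g) for g in genes]
    return {d: sum(f[d] for f in per_gene) / len(per_gene) for d in DINUCS}

def gene_distance(cds, mean):
    """Assumed metric: sum of absolute differences from the genome mean."""
    f = context_freqs(cds)
    return sum(abs(f[d] - mean[d]) for d in DINUCS)
```

Under these assumptions, genes with unusually large `gene_distance` values relative to the rest of a genome would be the outlier candidates that the abstract associates with possible horizontal transfer.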

7.
An estimated 80% of genomic DNA in eukaryotes is packaged as nucleosomes, which, together with the remaining interstitial linker regions, generate higher order chromatin structures [1]. Nucleosome sequences isolated from diverse organisms exhibit ∼10 bp periodic variations in AA, TT and GC dinucleotide frequencies. These sequence elements generate intrinsically curved DNA and help establish the histone-DNA interface. We investigated an important unanswered question concerning the interplay between chromatin organization and genome evolution: do the DNA sequence preferences inherent to the highly conserved histone core exert detectable natural selection on genomic divergence and polymorphism? To address this hypothesis, we isolated nucleosomal DNA sequences from Drosophila melanogaster embryos and examined the underlying genomic variation within and between species. We found that divergence along the D. melanogaster lineage is periodic across nucleosome regions with base changes following preferred nucleotides, providing new evidence for systematic evolutionary forces in the generation and maintenance of nucleosome-associated dinucleotide periodicities. Further, Single Nucleotide Polymorphism (SNP) frequency spectra show striking periodicities across nucleosomal regions, paralleling divergence patterns. Preferred alleles occur at higher frequencies in natural populations, consistent with a central role for natural selection. These patterns are stronger for nucleosomes in introns than in intergenic regions, suggesting selection is stronger in transcribed regions where nucleosomes undergo more displacement, remodeling and functional modification. In addition, we observe a large-scale (∼180 bp) periodic enrichment of AA/TT dinucleotides associated with nucleosome occupancy, while GC dinucleotide frequency peaks in linker regions. Divergence and polymorphism data also support a role for natural selection in the generation and maintenance of these super-nucleosomal patterns. Our results demonstrate that nucleosome-associated sequence periodicities are under selective pressure, implying that structural interactions between nucleosomes and DNA sequence shape sequence evolution, particularly in introns.

8.
No simple model exists that accurately describes the melting behavior and breathing dynamics of double-stranded DNA as a function of nucleotide sequence. This is especially true for homogeneous and periodic DNA sequences, which exhibit large deviations in melting temperature from predictions made by additive thermodynamic contributions. Currently, no method exists for analysis of the DNA breathing dynamics of repeats and of highly G/C- or A/T-rich regions, even though such sequences are widespread in vertebrate genomes. Here, we extend the nonlinear Peyrard–Bishop–Dauxois (PBD) model of DNA to include a sequence-dependent stacking term, resulting in a model that can accurately describe the melting behavior of homogeneous and periodic sequences. We collect melting data for several DNA oligos, and apply Monte Carlo simulations to establish force constants for the 10 dinucleotide steps (CG, CA, GC, AT, AG, AA, AC, TA, GG, TC). The experiments and numerical simulations confirm that the GG/CC dinucleotide stacking is remarkably unstable, compared with the stacking in GC/CG and CG/GC dinucleotide steps. The extended PBD model will facilitate thermodynamic and dynamic simulations of important genomic regions such as CpG islands and disease-related repeats.

9.
In this study we investigated the correlation between dinucleotide relative abundance values (the genomic signature) obtained from bacterial whole-genome sequences and two parameters widely used for bacterial classification, 16S rDNA sequence similarity and DNA-DNA hybridisation values. Twenty-eight completely sequenced bacterial genomes were included in the study. The correlation between the genomic signature and DNA-DNA hybridisation values was high, and taxa that showed less than 30% DNA-DNA binding will in general not have dinucleotide relative abundance dissimilarity (delta*) values below 40. On the other hand, taxa showing more than 50% DNA-DNA binding will not have delta* values higher than 17. Our data indicate that the overall correlation between genomic signature and 16S rDNA sequence similarity is low, except for closely related organisms (16S rDNA similarity >94%). Statistical analysis of delta* values between different subgroups of the Proteobacteria indicates that the beta- and gamma-Proteobacteria are more closely related to each other than to the other subgroups of the Proteobacteria and that the alpha- and epsilon-Proteobacteria form clearly separate subgroups. Using the genomic signature we have also predicted DNA-DNA binding values for fastidious or unculturable endosymbionts belonging to the genera Rickettsia, Wigglesworthia and Buchnera.
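
As a hedged illustration of the delta* measure mentioned in this abstract, the sketch below follows the commonly cited definition attributed to Karlin and co-workers: the relative abundance of dinucleotide XY is its frequency divided by the product of the frequencies of X and Y, computed over the sequence together with its reverse complement, and delta* is the mean absolute difference over the 16 dinucleotides. The abstract itself does not give the formula. The factor of 1000 is an assumption chosen to match the magnitude of the values quoted (for example 17 and 40), and the names `rho_star` and `delta_star` are hypothetical.

```python
from collections import Counter
from itertools import product

DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def rho_star(seq):
    """Strand-symmetrised dinucleotide relative abundances f_xy / (f_x * f_y)."""
    seq = seq.upper()
    mono, di = Counter(), Counter()
    for strand in (seq, revcomp(seq)):  # pool both strands
        mono.update(c for c in strand if c in "ACGT")
        di.update(strand[i:i + 2] for i in range(len(strand) - 1)
                  if strand[i:i + 2] in DINUCS)
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence too short or contains no A/C/G/T")
    rho = {}
    for d in DINUCS:
        fx, fy = mono[d[0]] / n_mono, mono[d[1]] / n_mono
        rho[d] = (di[d] / n_di) / (fx * fy) if fx and fy else 0.0
    return rho

def delta_star(seq_a, seq_b, scale=1000):
    """Mean absolute difference of relative abundances; `scale` is an assumed convention."""
    ra, rb = rho_star(seq_a), rho_star(seq_b)
    return scale * sum(abs(ra[d] - rb[d]) for d in DINUCS) / 16
```

Because both strands are pooled, a sequence and its reverse complement give a delta* of zero under this definition, which is why the measure can be applied without knowing strand orientation.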

10.
Dinucleotide usage is known to vary in the genomes of organisms. The dinucleotide usage profiles or genome signatures are similar for sequence samples taken from the same genome, but are different for taxonomically distant species. This concept of genome signatures has been used to study several organisms including viruses, to elucidate the signatures of evolutionary processes at the genome level. Genome signatures assume greater importance in the case of host–pathogen interactions, where molecular interactions between the two species take place continuously, and can influence their genomic composition. In this study, whole genome sequences of HIV-1 subtype B, a retrovirus that caused the global AIDS pandemic, were analysed for variation in the genome signatures of the virus from 1983 to 2007. We show statistically significant temporal variations in some dinucleotide patterns, highlighting the selective evolution of the dinucleotide profiles of HIV-1 subtype B, possibly a consequence of host-specific selection.

11.
Inoue S, Takahashi K, Ohta M. Genomics 1999, 57(1):169-172
A method was developed for effective isolation of trinucleotide repeats from genomic DNA. This method is based on the DNA polymerase reaction, which is restricted with only two or three nucleotide substrates and primed by biotinylated oligonucleotide probes. Sequences are then isolated by a streptavidin biotin-trapping method. More than 80% of the clones from each library contained more than eight trinucleotide repeats. Sequence analysis showed that the characteristic dinucleotide flanking sequences usually confronting various trinucleotide repeats are not found in the vicinity of CAG repeats, suggesting that CAG repeats may have been generated through a mechanism different from that of other trinucleotide repeats.

12.
The increasing number of whole genomic sequences of microorganisms has increased the complexity of genome-wide annotation and gene sequence comparison among multiple microorganisms. To address this problem, we have developed nWayComp software that compares DNA and protein sequences of phylogenetically related microorganisms. This package integrates a series of bioinformatics tools such as BLAST, ClustalW, ALIGN, PHYLIP and PRIMER3 for sequence comparison. It searches for homologous sequences among multiple organisms and identifies genes that are unique to a particular organism. The homologous gene sets are then ranked in descending order of sequence similarity. For each set of homologous sequences, a table of sequence identity among homologous genes along with sequence variations such as SNPs and INDELs is developed, and a phylogenetic tree is constructed. In addition, a common set of primers that can amplify all the homologous sequences is generated. The nWayComp package provides users with a quick and convenient tool to compare genomic sequences among multiple organisms at the whole-genome level.

13.
Exploiting dinucleotide microsatellites conserved among mammalian species   Total citations: 3, self-citations: 0, citations by others: 3
Dinucleotide microsatellites are useful for gene mapping projects. Depending upon the definition of conservation, published estimates of dinucleotide microsatellite conservation levels vary dramatically (30% to 100%). This study focused on well-characterized genes that contain microsatellites in the human genome. The objective was to examine the feasibility of developing microsatellite markers within genes on the basis of the assumption of microsatellite conservation across distantly related species. Eight genes (gamma-actin, carcinoembryonic antigen, apolipoprotein A-II, cardiac beta myosin heavy chain, laminin B2 chain, MHC class I CD8 alpha chain, C-reactive protein, and retinoblastoma susceptibility protein) containing large dinucleotide repeat units (N ≥ 15), complete genomic structure information, and homologous gene sequences in a second species were selected. Heterologous primers were designed from conserved exon sequences flanking a microsatellite motif. PCR products from bovine and porcine genomic DNA were tested for the presence of microsatellite sequences by Southern blot hybridization with biotin-labeled (CA)12 oligonucleotides. Fragments containing microsatellites were cloned and sequenced. Homology was verified by sequence comparisons between human and corresponding bovine or porcine fragments. Four of sixteen (25%) cross-amplified PCR products contained dinucleotide repetitive sequences with repeat unit lengths of 5 to 23. Two dinucleotide repetitive sequences showed microsatellite length polymorphism, and an additional sequence displayed single-strand conformational polymorphism. Results from this study suggest that exploitation of conserved microsatellite sequences is a useful approach for developing specific genetic markers for comparative mapping purposes. Received: 7 July 1995 / Accepted: 28 September 1995
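
The selection criterion of at least 15 dinucleotide repeat units can be mimicked in silico with a regular expression. This is only a computational analogue of the screening idea, not the laboratory procedure described in the abstract (PCR and Southern blot hybridization). The function name `find_dinucleotide_repeats` and its return format are hypothetical.

```python
import re

def find_dinucleotide_repeats(seq, min_units=15):
    """Return (start, unit, n_units) for each perfect dinucleotide repeat of at least min_units units."""
    seq = seq.upper()
    # Group 2 is the first base; (?!\2) forces the second base to differ, which
    # excludes homopolymer runs. Group 1 is the 2-base unit, and \1{k,} requires
    # at least k further copies of that unit.
    pattern = re.compile(r"(([ACGT])(?!\2)[ACGT])\1{%d,}" % (min_units - 1))
    return [(m.start(), m.group(1), len(m.group(0)) // 2)
            for m in pattern.finditer(seq)]

# Usage: a (CA)16 tract embedded in flanking sequence.
hits = find_dinucleotide_repeats("GGG" + "CA" * 16 + "TTT")
```

Only perfect, uninterrupted repeats are reported; tolerating interruptions would need a different approach than a single backreference.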

14.
The similarity of two nucleotide sequences is often expressed in terms of evolutionary distance, a measure of the amount of change needed to transform one sequence into the other. Given two sequences with a small distance between them, can their similarity be explained by their base composition alone? The nucleotide order of these sequences contributes to their similarity if the distance is much smaller than their average permutation distance, which is obtained by calculating the distances for many random permutations of these sequences. To determine whether their similarity can be explained by their dinucleotide and codon usage, random sequences must be chosen from the set of permuted sequences that preserve dinucleotide and codon usage. The problem of choosing random dinucleotide and codon-preserving permutations can be expressed in the language of graph theory as the problem of generating random Eulerian walks on a directed multigraph. An efficient algorithm for generating such walks is described. This algorithm can be used to choose random sequence permutations that preserve (1) dinucleotide usage, (2) dinucleotide and trinucleotide usage, or (3) dinucleotide and codon usage. For example, the similarity of two 60-nucleotide DNA segments from the human beta-1 interferon gene (nucleotides 196-255 and 499-558) is not just the result of their nonrandom dinucleotide and codon usage.
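
A minimal sketch of case (1), a dinucleotide-preserving shuffle via a random Eulerian walk, is given below. Nucleotides are treated as vertices and each adjacent pair as a directed edge. The abstract does not spell out the algorithm, so this sketch assumes one well-known construction: pick a random "last exit" edge for every vertex except the final one, retry until those edges form a tree leading to the final vertex, shuffle the remaining edges, then walk. It may differ in detail from the authors' procedure. The names `dinucleotide_shuffle` and `dinucleotide_counts` are hypothetical.

```python
import random
from collections import Counter, defaultdict

def dinucleotide_counts(seq):
    """Counts of overlapping dinucleotides, used to check that a shuffle preserved them."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

def _reaches(v, root, last_edge):
    """True if following last-exit edges from v arrives at root without cycling."""
    seen = set()
    while v != root:
        if v in seen:
            return False
        seen.add(v)
        v = last_edge[v]
    return True

def dinucleotide_shuffle(seq, rng=None):
    """Random permutation of seq with the same first/last symbol and dinucleotide counts."""
    rng = rng or random.Random()
    if len(seq) < 3:
        return seq
    first, last = seq[0], seq[-1]
    successors = defaultdict(list)  # vertex -> multiset of outgoing edges
    for a, b in zip(seq, seq[1:]):
        successors[a].append(b)
    vertices = set(seq)
    # Rejection-sample last-exit edges until they form a tree directed to `last`.
    while True:
        last_edge = {v: rng.choice(successors[v]) for v in vertices if v != last}
        if all(_reaches(v, last, last_edge) for v in last_edge):
            break
    # Shuffle each vertex's other edges and put its last-exit edge at the end.
    order = {}
    for v in vertices:
        edges = list(successors[v])
        if v in last_edge:
            edges.remove(last_edge[v])
        rng.shuffle(edges)
        if v in last_edge:
            edges.append(last_edge[v])
        order[v] = edges
    # Walk the graph, consuming each vertex's edges in the chosen order.
    out, used, cur = [first], defaultdict(int), first
    for _ in range(len(seq) - 1):
        nxt = order[cur][used[cur]]
        used[cur] += 1
        out.append(nxt)
        cur = nxt
    return "".join(out)
```

A quick check is that `dinucleotide_counts(dinucleotide_shuffle(s)) == dinucleotide_counts(s)` holds for any input `s`. Extending the idea to preserve trinucleotide or codon usage, as the abstract describes, would need a different vertex definition and is not attempted here.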

15.
A plasmid, AWZ1, that contained a dinucleotide (GT)n repeat was identified from a chromosome 21-specific genomic library. When amplified by PCR from human genomic DNA, the repeat length was highly polymorphic between individuals; its location, D21S215, was mapped in the CEPH pedigrees by linkage analysis to the pericentromeric region of chromosome 21. It is the closest polymorphic marker to alphoid sequences on this chromosome.

16.
Non-additivity in protein-DNA binding   Total citations: 3, self-citations: 0, citations by others: 3

17.
Species-specific patterns of DNA bending and sequence.   Total citations: 16, self-citations: 6, citations by others: 10
Nucleotide sequences in the GenEMBL database were analyzed using strategies designed to reveal species-specific patterns of DNA bending and DNA sequence. The results uncovered striking species-dependent patterns of bending with more variations among individual organisms than between prokaryotes and eukaryotes. The frequency of bent sites in sequences from different bacteria was related to genomic A + T content and this relationship was confirmed by electrophoretic analysis of genomic DNA. However, base composition was not an accurate predictor for DNA bending in eukaryotes. Sequences from C. elegans exhibited the highest frequency of bent sites in the database and the RNA polymerase II locus from the nematode was the most bent gene in GenEMBL. Bent DNA extended throughout most introns and gene flanking segments from C. elegans while exon regions lacked A-tract bending characteristics. Independent evidence for the strong bending character of this genome was provided by electrophoretic studies which revealed that a large number of the fragments from C. elegans DNA exhibited anomalous gel mobilities when compared to genomic fragments from over 20 other organisms. The prevalence of bent sites in this genome enabled us to detect selectively C. elegans sequences in a computer search of the database using as probes C. elegans introns, bending elements, and a 20 nucleotide consensus sequence for bent DNA. This approach was also used to provide additional examples of species-specific sequence patterns in eukaryotes where it was shown that (A)≥10 and (A.T)≥5 tracts are prevalent throughout the untranslated DNA of D. discoideum and P. falciparum, respectively. These results provide new insight into the organization of eukaryotic DNA because they show that species-specific patterns of simple sequences are found in introns and in other untranslated regions of the genome.

18.
19.
A set of 16 kinds of dinucleotide compositions was used to analyze the protein-encoding nucleotide sequences in nine complete genomes: Escherichia coli, Haemophilus influenzae, Helicobacter pylori, Mycoplasma genitalium, Mycoplasma pneumoniae, Synechocystis sp., Methanococcus jannaschii, Archaeoglobus fulgidus, and Saccharomyces cerevisiae. The dinucleotide composition was significantly different between the organisms. The distribution of genes from an organism was clustered around its center in the dinucleotide composition space. The genes from closely related organisms such as Gram-negative bacteria, mycoplasma species and eukaryotes showed some overlap in the space. The genes from nine complete genomes together with those from human were discriminated into respective clusters with 80% accuracy using the dinucleotide composition alone. The composition data estimated from a whole genome was close to that obtained from genes, indicating that the characteristic feature of dinucleotides holds not only for protein coding regions but also noncoding regions. When a dendrogram was constructed from the disposition of the clusters in the dinucleotide space, it resembled the real phylogenetic tree. Thus, the distinct feature observed in the dinucleotide composition may reflect the phylogenetic relationship of organisms.
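
The idea of genes clustering around an organism's center in a 16-dimensional dinucleotide composition space can be sketched as a nearest-centroid assignment. The abstract does not name the discrimination method actually used, so nearest-centroid classification with Euclidean distance is an assumption here, as are the function names `composition`, `centroid` and `nearest_centroid`.

```python
import math
from itertools import product

DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]

def composition(seq):
    """16-dimensional vector of overlapping dinucleotide frequencies."""
    seq = seq.upper()
    counts = dict.fromkeys(DINUCS, 0)
    for i in range(len(seq) - 1):
        d = seq[i:i + 2]
        if d in counts:  # skip pairs containing ambiguous bases
            counts[d] += 1
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in DINUCS]

def centroid(vectors):
    """Component-wise mean of a list of composition vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def nearest_centroid(vec, centroids):
    """Name of the organism whose centroid is closest (Euclidean distance, assumed)."""
    return min(centroids, key=lambda name: math.dist(vec, centroids[name]))
```

In use, `centroids` would map each organism name to the centroid of its genes' composition vectors, and a query gene would be assigned to the organism returned by `nearest_centroid(composition(gene), centroids)`.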

20.
Comparison of genomic DNA sequences: solved and unsolved problems   Total citations: 5, self-citations: 0, citations by others: 5
MOTIVATION: The DNA sequences of entire genomes are being determined at a rapid rate. Whereas initial genome sequencing efforts were for organisms chosen to be widely spaced in the tree of life, there is a growing emphasis on projects to sequence a species that is sufficiently similar to an already-sequenced species to allow direct comparison of those two DNA sequences. This and other changes in genome sequencing strategies have created a strong need for new methods to compare genomic sequences. RESULTS: We sketch the current state of software for comparing genomic DNA sequences and outline research directions that we believe are likely to result in important advances in practice.
