Similar literature
20 similar articles found (search time: 562 ms)
1.
A wavelet transform of the DNA "walk" constructed from a genomic sequence offers a direct visualization of short and long-range patterns in nucleotide sequences. We study sequences that encode diverse biological functions, taken from a variety of genomes. Pattern irregularities in the transform are frequently associated with sequences of biological interest. Exonic regions, for example, visualize differently under wavelet analysis than introns, and ribosomal RNA regions display distinct universal signatures. DNA walk wavelet analysis can provide a sensitive and rapid assessment of the putative biological significance of genomic DNA.
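The DNA walk and its wavelet transform can be sketched as follows. This is only an illustration: the abstract does not state the step rule or the wavelet used, so the purine/pyrimidine rule (`STEP`), the Haar-like wavelet, and the function names `dna_walk` and `haar_coefficients` are all assumptions.

```python
import math

# Assumed step rule (not stated in the abstract):
# pyrimidines (C, T) step up, purines (A, G) step down.
STEP = {"C": 1, "T": 1, "A": -1, "G": -1}


def dna_walk(seq):
    """Return the cumulative DNA walk for a nucleotide string."""
    walk, position = [], 0
    for base in seq.upper():
        position += STEP.get(base, 0)  # unknown symbols (e.g. N) leave the walk flat
        walk.append(position)
    return walk


def haar_coefficients(walk, scale):
    """Haar-like wavelet coefficients of the walk at a single scale.

    Each coefficient is the sum over a window of `scale` points minus the sum
    over the following `scale` points, divided by sqrt(2 * scale).
    """
    norm = math.sqrt(2 * scale)
    coeffs = []
    for i in range(len(walk) - 2 * scale + 1):
        left = sum(walk[i:i + scale])
        right = sum(walk[i + scale:i + 2 * scale])
        coeffs.append((left - right) / norm)
    return coeffs
```

Computing `haar_coefficients` for several values of `scale` and stacking the rows gives a position-by-scale picture in which abrupt changes in composition show up as large coefficients.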

2.
We investigated the thermodynamic stability of double-stranded DNAs with an oxidative DNA lesion, 2-hydroxyadenine (2-OH-Ade), in two different sequence contexts (5′-GA*C-3′ and 5′-TA*A-3′, A* represents 2-OH-Ade). When an A*–N pair (N, any nucleotide base) was located in the center of a duplex, the thermodynamic stabilities of the duplexes were similar for all the natural bases except A (N = T, C and G). On the other hand, for the duplexes with the A*–N pair at the end, which mimic the nucleotide incorporation step, the stabilities of the duplexes were dependent on their sequence. The order of stability is T > G > C >> A in the 5′-GA*C-3′ sequences and T > A > C > G in the 5′-TA*A-3′ sequences. Because T/G/C and T/A are nucleotides incorporated opposite to 2-OH-Ade in the 5′-GA*C-3′ and 5′-TA*A-3′ sequences, respectively, these results agree with the tendency of mutagenic misincorporation of the nucleotides opposite to 2-OH-Ade in vitro. Thus, the thermodynamic stability of the A*–N base pair may be an important factor for the mutation spectra of 2-OH-Ade.

3.
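To investigate the multifractal properties of genomic sequences in depth, we first selected 12 relatively long DNA sequences and converted each into a time series based on its coding/non-coding segments. We then applied multifractal Hurst analysis to the 12 time series, computed their Hurst exponents, and used the exponents to analyze the self-similarity of the sequences. Comparing the resulting Hurst exponents with the one-dimensional DNA walk model showed that all 12 sequences exhibit long-range correlations, indicating that long-range correlations do exist in DNA sequences.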
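As a rough illustration of how a Hurst exponent can be estimated from a time series, the sketch below uses classical rescaled-range (R/S) analysis. This is a simpler, monofractal stand-in and not the multifractal procedure the abstract refers to; the function names and the default window sizes are assumptions.

```python
import math


def rescaled_range(window):
    """R/S statistic of one window: range of cumulative deviations over the std."""
    n = len(window)
    mean = sum(window) / n
    cumulative, lowest, highest = 0.0, 0.0, 0.0
    for x in window:
        cumulative += x - mean
        lowest, highest = min(lowest, cumulative), max(highest, cumulative)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    return (highest - lowest) / std if std > 0 else None


def hurst_exponent(series, window_sizes=(16, 32, 64, 128, 256)):
    """Estimate H as the slope of log(mean R/S) against log(window size)."""
    xs, ys = [], []
    for n in window_sizes:
        values = []
        for start in range(0, len(series) - n + 1, n):  # non-overlapping windows
            rs = rescaled_range(series[start:start + n])
            if rs is not None:
                values.append(rs)
        if values:
            xs.append(math.log(n))
            ys.append(math.log(sum(values) / len(values)))
    if len(xs) < 2:
        raise ValueError("series too short for the requested window sizes")
    # Ordinary least-squares slope of ys on xs.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
```

An estimate near 0.5 is what uncorrelated noise gives; values clearly above 0.5 point to persistent, long-range correlated behavior. R/S estimates are biased upward for short windows, so the result should be read as indicative only.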

4.

Background

The relatively short read lengths from next generation sequencing (NGS) technologies still pose a challenge for de novo assembly of complex mammal genomes. One important solution is to use paired-end (PE) sequence information experimentally obtained from long-range DNA fragments (>1 kb). Here, we characterize and extend a long-range PE library construction method based on direct intra-molecule ligation (or molecular linker-free circularization) for NGS.

Results

We found that the method performs stably for PE sequencing of 2- to 5-kb DNA fragments, and can be extended to 10–20 kb (and even in extremes, up to ∼35 kb). We also characterized the impact of low quality input DNA on the method, and developed a whole-genome amplification (WGA) based protocol using limited input DNA (<1 µg). Using this PE dataset, we accurately assembled the YanHuang (YH) genome, the first sequenced Asian genome, into a scaffold N50 size of >2 Mb, which is over 100 times greater than the initial size produced with only small insert PE reads (17 kb). In addition, we mapped two 7- to 8-kb insertions in the YH genome using the larger insert sizes of the long-range PE data.

Conclusions

In conclusion, we demonstrate here the effectiveness of this long-range PE sequencing method and its use for the de novo assembly of a large, complex genome using NGS short reads.
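The scaffold N50 figure quoted in the Results can be made concrete with a short sketch. It uses the usual definition of N50 (the length L such that scaffolds of length at least L together cover at least half of the assembly); the function name `n50` and the example lengths are illustrative and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")
```

For the made-up lengths `[2, 3, 4, 5, 6]` (total 20), the two longest scaffolds cover 11, which passes the halfway mark of 10, so the N50 is 5.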

5.
We show that repeated sequences, like palindromes (local repetitions) and homologies between two different nucleotide sequences (motifs along the genome), compose a self-similar (fractal) pattern in mitochondrial DNA. This self-similarity comes from the looplike structures distributed along the genome. The looplike structures generate scaling laws in a pseudorandom DNA walk constructed from the sequence, called a Lévy flight. We measure the scaling laws from the generalized fractal dimension and singularity spectrum for mitochondrial DNA walks for 35 different species. In particular, we report characteristic loop distributions for mammal mitochondrial genomes.

6.
DNA sequence analysis by oligonucleotide binding is often affected by interference with the secondary structure of the target DNA. Here we describe an approach that improves DNA secondary structure prediction by combining enzymatic probing of DNA by structure-specific 5′-nucleases with an energy minimization algorithm that utilizes the 5′-nuclease cleavage sites as constraints. The method can identify structural differences between two DNA molecules caused by minor sequence variations such as a single nucleotide mutation. It also demonstrates the existence of long-range interactions between DNA regions separated by >300 nt and the formation of multiple alternative structures by a 244 nt DNA molecule. The differences in the secondary structure of DNA molecules revealed by 5′-nuclease probing were used to design structure-specific probes for mutation discrimination that target the regions of structural, rather than sequence, differences. We also demonstrate the performance of structure-specific ‘bridge’ probes complementary to non-contiguous regions of the target molecule. The structure-specific probes do not require the high stringency binding conditions necessary for methods based on mismatch formation and permit mutation detection at temperatures from 4 to 37°C. Structure-specific sequence analysis is applied for mutation detection in the Mycobacterium tuberculosis katG gene and for genotyping of the hepatitis C virus.

7.
8.

Background

Control of breathing, heart rate, and body temperature are interdependent in infants, where instabilities in thermoregulation can contribute to apneas or even life-threatening events. Identifying abnormalities in thermoregulation is particularly important in the first 6 months of life, where autonomic regulation undergoes critical development. Fluctuations in body temperature have been shown to be sensitive to maturational stage as well as system failure in critically ill patients. We thus aimed to investigate the existence of fractal-like long-range correlations, indicative of temperature control, in night time rectal temperature (Trec) patterns in maturing infants.

Methodology/Principal Findings

We measured Trec fluctuations in infants every 4 weeks from 4 to 20 weeks of age and before and after immunization. Long-range correlations in the temperature series were quantified by the correlation exponent, α, using detrended fluctuation analysis. The effects of maturation, room temperature, and immunization on the strength of correlation were investigated. We found that Trec fluctuations exhibit fractal long-range correlations with a mean (SD) α of 1.51 (0.11), indicating that Trec is regulated in a highly correlated and hence deterministic manner. A significant increase in α with age from 1.42 (0.07) at 4 weeks to 1.58 (0.04) at 20 weeks reflects a change in long-range correlation behavior with maturation towards a smoother and more deterministic temperature regulation, potentially due to the decrease in surface area to body weight ratio in the maturing infant. α was not associated with mean room temperature or influenced by immunization.

Conclusions

This study shows that the quantification of long-range correlations using α derived from detrended fluctuation analysis is an observer-independent tool which can distinguish developmental stages of night time Trec pattern in young infants, reflective of maturation of the autonomic system. Detrended fluctuation analysis may prove useful for characterizing thermoregulation in premature and other infants at risk for life-threatening events.
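A minimal sketch of first-order detrended fluctuation analysis, the technique named in this abstract, is given below. The function names, the default scales, and the choice of linear (order-1) detrending are assumptions for illustration; the study's own implementation and parameters are not given here.

```python
import math


def _linear_residual_ss(segment):
    """Sum of squared residuals after removing a least-squares line from `segment`."""
    n = len(segment)
    t_mean = (n - 1) / 2
    y_mean = sum(segment) / n
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = s_ty / s_tt
    return sum((y - y_mean - slope * (t - t_mean)) ** 2 for t, y in enumerate(segment))


def dfa_alpha(series, scales=(8, 16, 32, 64, 128)):
    """Estimate the DFA exponent alpha as the slope of log F(n) against log n."""
    mean = sum(series) / len(series)
    # Step 1: integrate the mean-subtracted series to obtain the profile.
    profile, total = [], 0.0
    for x in series:
        total += x - mean
        profile.append(total)
    log_n, log_f = [], []
    for n in scales:
        n_windows = len(profile) // n
        if n_windows < 1:
            continue
        # Step 2: detrend each non-overlapping window and pool the squared residuals.
        ss = sum(_linear_residual_ss(profile[w * n:(w + 1) * n]) for w in range(n_windows))
        fluctuation = math.sqrt(ss / (n_windows * n))
        if fluctuation > 0:
            log_n.append(math.log(n))
            log_f.append(math.log(fluctuation))
    if len(log_n) < 2:
        raise ValueError("need at least two usable scales")
    # Step 3: ordinary least-squares slope on the log-log plot.
    mx, my = sum(log_n) / len(log_n), sum(log_f) / len(log_f)
    num = sum((x - mx) * (y - my) for x, y in zip(log_n, log_f))
    return num / sum((x - mx) ** 2 for x in log_n)
```

For orientation, uncorrelated noise gives α close to 0.5 and a random walk gives α close to 1.5, so the values of 1.42 to 1.58 reported above correspond to smooth, random-walk-like fluctuations.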

9.
A dialkyl-substituted anthraquinone derivative was synthesized and ligated to a sequence-directing oligodeoxynucleotide to examine its efficiency and specificity for cross-linking to complementary sequences of DNA. The anthraquinone appendage stabilized spontaneous hybridization of the target and probe sequences through non-covalent interactions, as indicated by thermal denaturation studies. Covalent modification of the target was induced by exposure to near UV light (λ > 335 nm) to generate cross-linked duplexes in yields as great as 45%. Reaction was dependent on the first unpaired nucleotide extended beyond the duplex formed by association of the target and probe. A specificity of C > T > A = G was determined for modification at this position. The overall site and nucleotide selectivity seems to originate from the chemical requirements of cross-linking and does not likely reflect the dominant solution structure of the complex prior to irradiation.

10.
In contrast to mammals, the evolution of MHC genes in birds appears to be characterized by high rates of gene duplication and concerted evolution. To further our understanding of the evolution of passerine MHC genes, we have isolated class II B sequences from two species of New Zealand robins, the South Island robin (Petroica australis australis), and the endangered Chatham Island black robin (Petroica traversi). Using an RT-PCR based approach we isolated four transcribed class II B MHC sequences from the black robin, and eight sequences from the South Island robin. RFLP analysis indicated that all class II B loci were contained within a single linkage group. Analysis of 3′-untranslated region sequences enabled putative orthologous loci to be identified in the two species, and indicated that multiple rounds of gene duplication have occurred within the MHC of New Zealand robins. The orthologous relationships are not retained within the coding region of the gene; instead, the sequences group within species. A number of putative gene conversion events were identified across the length of our sequences that may account for this. Exon 2 sequences are highly diverse and appear to have diverged under balancing selection. It is also possible that gene conversion involving short stretches of sequence within exon 2 adds to this diversity. Our study is the first report of putative orthologous MHC loci in passerines, and provides further evidence for the importance of gene duplication and gene conversion in the evolution of the passerine MHC. Nucleotide sequence data reported in this paper are available in the GenBank database under the accession numbers AY258333–AY258335, AY428561–AY428570, and AY530534–AY530535.

11.
We have developed a locus-specific DNA target preparation method for highly multiplexed single nucleotide polymorphism (SNP) genotyping called MARA (Multiplexed Anchored Runoff Amplification). The approach uses a single primer per SNP in conjunction with restriction enzyme digested, adapter-ligated human genomic DNA. Each primer is composed of common sequence at the 5′ end followed by locus-specific sequence at the 3′ end. Following a primary reaction in which locus-specific products are generated, a secondary universal amplification is carried out using a generic primer pair corresponding to the oligonucleotide and genomic DNA adapter sequences. Allele discrimination is achieved by hybridization to high-density DNA oligonucleotide arrays. Initial multiplex reactions containing either 250 primers or 750 primers across nine DNA samples demonstrated an average sample call rate of ~95% for 250- and 750-plex MARA. We have also evaluated >1000- and 4000-primer plex MARA to genotype SNPs from human chromosome 21. We have identified a subset of SNPs corresponding to a primer conversion rate of ~75%, which show an average call rate over 95% and concordance >99% across seven DNA samples. Thus, MARA may potentially improve the throughput of SNP genotyping when coupled with allele discrimination on high-density arrays by allowing levels of multiplexing during target generation that far exceed the capacity of traditional multiplex PCR.

12.
Integration of a conjugative plasmid into a bacterial chromosome can promote the transfer of chromosomal DNA to other bacteria. Intraspecies chromosomal conjugation is believed responsible for creating the global pathogens Klebsiella pneumoniae ST258 and Escherichia coli ST1193. Interspecies conjugation is also possible but little is known about the genetic architecture or fitness of such hybrids. To study this, we generated by conjugation 14 hybrids of E. coli and Salmonella enterica. These species belong to different genera, diverged from a common ancestor >100 Ma, and share a conserved order of orthologous genes with ∼15% nucleotide divergence. Genomic analysis revealed that all but one hybrid had acquired a contiguous segment of donor E. coli DNA, replacing a homologous region of recipient Salmonella chromosome, and ranging in size from ∼100 to >4,000 kb. Recombination joints occurred in sequences with higher-than-average nucleotide identity. Most hybrid strains suffered a large reduction in growth rate, but the magnitude of this cost did not correlate with the length of foreign DNA. Compensatory evolution to ameliorate the cost of low-fitness hybrids pointed towards disruption of complex genetic networks as a cause. Most interestingly, 4 of the 14 hybrids, in which from 45% to 90% of the Salmonella chromosome was replaced with E. coli DNA, showed no significant reduction in growth fitness. These data suggest that the barriers to creating high-fitness interspecies hybrids may be significantly lower than generally appreciated with implications for the creation of novel species.

13.
The multifractal analysis of binary images of DNA is studied in order to define a methodological approach to the classification of DNA sequences. This method is based on the computation of some multifractality parameters on a suitable binary image of DNA, which takes into account the nucleotide distribution. The binary image of DNA is obtained by a dot-plot (recurrence plot) of the indicator matrix. The fractal geometry of these images is characterized by fractal dimension (FD), lacunarity, and succolarity. These parameters are compared with some other coefficients such as complexity and Shannon information entropy. It will be shown that the complexity parameters are more or less equivalent to FD, while the parameters of multifractality have different values in the sense that sequences with higher FD might have lower lacunarity and/or succolarity. In particular, the genome of Drosophila melanogaster has been considered by focusing on the chromosome 3r, which shows the highest fractality with a corresponding higher level of complexity. We will single out some results on the nucleotide distribution in 3r with respect to complexity and fractality. In particular, we will show that sequences with higher FD also have a higher frequency distribution of guanine, while low FD is characterized by the higher presence of adenine.
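The indicator-matrix dot plot and one of the fractal measures mentioned above can be sketched as follows. The box-counting estimator is a standard way to approximate a fractal dimension, but it is an assumption that it matches the FD computation used in the paper; the function names are likewise illustrative, and lacunarity and succolarity are not covered.

```python
import math


def indicator_matrix(seq):
    """Binary dot plot: entry (i, j) is 1 when positions i and j carry the same symbol."""
    return [[1 if a == b else 0 for b in seq] for a in seq]


def box_counting_dimension(image):
    """Estimate the box-counting dimension of a square binary image.

    Counts the boxes of side 1, 2, 4, ... that contain at least one set pixel
    and returns the slope of log(count) against log(1 / side).
    """
    size = len(image)
    xs, ys = [], []
    box = 1
    while box < size:
        count = 0
        for r in range(0, size, box):
            for c in range(0, size, box):
                rows = range(r, min(r + box, size))
                cols = range(c, min(c + box, size))
                if any(image[i][j] for i in rows for j in cols):
                    count += 1
        if count > 0:
            xs.append(math.log(1 / box))
            ys.append(math.log(count))
        box *= 2
    if len(xs) < 2:
        raise ValueError("image too small or empty for a slope estimate")
    # Ordinary least-squares slope of ys on xs.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
```

Two limiting cases help calibrate the estimate: a completely filled image (a homopolymer such as `"AAAAAAAA"`) gives a dimension of 2, and an image containing only the main diagonal gives 1. Real sequences fall in between.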

14.
The 50 non-coding bases immediately internal to the telomeric repeats in the two 5′ ends of macronuclear DNA molecules of a group of hypotrichous ciliates are anomalous in composition, consisting of 61% purines and 39% pyrimidines, A>T (ratio of 44:32), and G>C (ratio of 17:7). These ratio imbalances violate parity rule 2, according to which A should equal T and G should equal C within a DNA strand and therefore pyrimidines should equal purines. The purine-rich and base ratio imbalances are in marked contrast to the rest of the non-coding parts of the molecules, which have the theoretically expected purine content of 50%, with A = T and G = C. The ORFs contain an average of 52% purines as a result of bias in codon usage. The 50 bases that flank the 5′ ends of macronuclear sequences in micronuclear DNA (12 cases) consist of ~50% purines. Thus, the 50 bases in the 5′ ends of macronuclear sequences in micronuclear DNA are islands of purine richness in which A>T and G>C. These islands may serve as signals for the excision of macronuclear molecules during macronuclear development. We have found no published reports of coding or non-coding native DNA with such anomalous base composition.
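The composition figures above are internally consistent: 44% A plus 17% G gives 61% purines, and 32% T plus 7% C gives 39% pyrimidines. A small sketch for checking purine content and parity rule 2 on a single strand follows; the function names and the skew measure `(X - Y) / (X + Y)` are illustrative choices, not taken from the paper.

```python
from collections import Counter


def purine_fraction(seq):
    """Fraction of A and G among the A, C, G, T bases of one strand."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (counts["A"] + counts["G"]) / total


def parity_skews(seq):
    """Return ((A - T) / (A + T), (G - C) / (G + C)); both are 0 under parity rule 2.

    A skew is reported as None when the corresponding base pair is absent.
    """
    counts = Counter(seq.upper())

    def skew(x, y):
        denom = counts[x] + counts[y]
        return (counts[x] - counts[y]) / denom if denom else None

    return skew("A", "T"), skew("G", "C")
```

Applied to a synthetic strand with the reported proportions (44 A, 32 T, 17 G, 7 C per 100 bases), both skews come out clearly positive, which is the imbalance the abstract describes.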

15.
Negatively twisted DNA is essential to many biological functions. Due to torsional stress, duplex DNA can have local, sequence-dependent structural defects. In this work, a thermodynamic model of DNA was built to qualitatively predict the local sequence-dependent mechanical instabilities under torsional stress. The results were compared to both simulation of a coarse-grained model and experimental results. By using the Kirkwood superposition approximation, we built an analytical model to represent the free energy difference ΔW of a hydrogen-bonded basepair between the B-form helical state and the basepair opened (or locally melted) state, within a given sequence under torsional stress. We showed that ΔW can be well approximated by two-body interactions with its nearest-sequence-neighbor basepairs plus a free energy correction due to long-range correlations. This model is capable of rapidly predicting the position and thermodynamics of local defects in a given sequence. The result qualitatively matches with an in vitro experiment for a long DNA sequence (>4000 basepairs). The 12 parameters used in this model can be further quantitatively refined when more experimental data are available.

16.

Background

Analysis of single nucleotide polymorphisms (SNPs) derived from whole-genome studies allows for rapid evaluation of genome-wide diversity, and genomic epidemiology studies of Plasmodium falciparum provide insights into parasite population structure, gene flow, drug resistance and vaccine development. In areas with adequate cold chain facilities, large volumes of leukocyte-depleted patient blood can be frozen for use in parasite genomic analyses. In more remote endemic areas smaller volumes of infected blood are taken by finger prick, and dried and stored on filter paper. These dried blood spots do not generally yield enough concentrated parasite DNA for whole-genome sequencing.

Results

A DNA microarray was designed for use on field samples to type a genome-wide set of SNPs which prior sequencing had shown to be variable in Africa, Southeast Asia, and Papua New Guinea. An algorithm was designed to call SNPs in samples with low parasite DNA. With this new algorithm SNP-calling accuracy of 98% was measured by hybridizing purified DNA from malaria lab strains and comparing calls with SNPs called from full genome sequences. An average accuracy of >98% was likewise obtained for DNA extracted from malaria field samples collected in studies in Southeast Asia, with an average call rate of > 82%.

Conclusion

This new high-density microarray provided high quality SNP calls from a wide range of parasite DNA quantities, and represents a robust tool for genome-wide analysis of malaria parasites in diverse settings.

17.
The rates and patterns of deletions in the human factor IX gene.   Cited by: 4 (self-citations: 2; citations by others: 2)
Deletions are commonly observed in genes with either segments of highly homologous sequences or excessive gene length. However, in the factor IX gene and in most genes, deletions (of ≥21 bp) are uncommon. We have analyzed DNA from 290 families with hemophilia B (203 independent mutations) and have found 12 deletions >20 bp. Eleven of these are >2 kb (range >3-163 kb), and one is 1.1 kb. The junctions of the four deletions that are completely contained within the factor IX gene have been determined. A novel mutation occurred in patient HB128: the data suggest that a 26.8-kb deletion occurred between two segments of alternating purines and pyrimidines and that a 2.3-kb sense strand segment derived from the deleted region was inserted. For our sample of 203 independent mutations, we estimate the "baseline" rates of deletional mutation per base pair per generation as a function of size. The rate for large (>2 kb) deletions is exceedingly low. For every mutational event in which a given base is at the junction of a large deletion, there are an estimated 58 microdeletions (<20 bp) and 985 single-base substitutions at that base. Analysis of the nine reported deletion junctions in the factor IX gene literature reveals that (i) five are associated with inversions, orphan sequences, or sense strand insertions; (ii) four are simple deletions that display an excess of short direct repeats at their junctions; (iii) there is no dramatic clustering of junctions within the gene; and (iv) with the exception of alternating purines and pyrimidines, deletion junctions are not preferentially associated with repetitive DNA.

18.
Study of statistical correlations in DNA sequences   Cited by: 3 (self-citations: 0; citations by others: 3)
Here we present a study of statistical correlations among different positions in DNA sequences and their implications by directly using the autocorrelation function. Such an analysis is possible now because of the availability of large sequences or even complete genomes of many organisms. After describing the way in which the autocorrelation function can be applied to DNA-sequence analysis, we show that long-range correlations, implying scale independence, appear in several bacterial genomes as well as in long human chromosome contigs. The source for such correlations in bacteria, which may extend up to 60 kb in Bacillus subtilis, may be related to massive lateral transfer of compositionally biased genes from other genomes. In the human genome, correlations extend for more than five decades and may be related to the evolution of the 'neogenome', a modern evolutionary acquisition composed of GC-rich isochores displaying long-range correlations and scale invariance.
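One simple way to apply an autocorrelation function to a DNA sequence is sketched below. The abstract does not say how nucleotides are mapped to numbers, so the GC indicator used here (1 for G or C, 0 otherwise) and the function names are assumptions.

```python
def gc_indicator(seq):
    """Map a nucleotide string to 1 (G or C) and 0 (anything else)."""
    return [1 if base in ("G", "C") else 0 for base in seq.upper()]


def autocorrelation(values, lag):
    """Normalized autocorrelation of `values` at the given lag.

    Uses the global mean and variance, so the result is 1 at lag 0.
    """
    n = len(values)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(values)")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    covariance = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    ) / (n - lag)
    return covariance / variance
```

Plotting `autocorrelation(gc_indicator(seq), k)` against `k` on log-log axes is the usual check: a slow, roughly power-law decay over many lags is the signature of long-range correlation, whereas a fast drop to zero indicates short-range structure only.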

19.
Splice sites (SSs), short nucleotide sequences flanking introns, are under selection for spliceosome binding, and adhere to consensus sequences. However, non-consensus nucleotides, many of which probably reduce SS performance, are frequent. Little is known about the mechanisms maintaining such apparently suboptimal SSs. Here, we study the correlations between strengths of nucleotides occupying different positions of the same SS. Such correlations may arise due to epistatic interactions between positions (i.e., a situation when the fitness effect of a nucleotide in one position depends on the nucleotide in another position), their evolutionary history, or to other reasons. Within both the intronic and the exonic parts of donor SSs, nucleotides that increase (decrease) SS strength tend to co-occur with other nucleotides increasing (respectively, decreasing) it, consistent with positive epistasis. Between the intronic and exonic parts of donor SSs, the correlations of nucleotide strengths tend to be negative, consistent with negative epistasis. In the course of evolution, substitutions at a donor SS tend to decrease the strength of its exonic part, and either increase or do not change the strength of its intronic part. In acceptor SSs, the situation is more complicated; the correlations between adjacent positions appear to be driven mainly by avoidance of the AG dinucleotide which may cause aberrant splicing. In summary, both the content and the evolution of SSs are shaped by a complex network of interdependences between adjacent nucleotides that respond to a range of sometimes conflicting selective constraints.

20.