Similar literature
 20 similar documents found
1.
Repetitive DNA sequences in the rice genome comprise more than half of the nuclear DNA. The isolation and characterization of these repetitive DNA sequences should lead to a better understanding of rice chromosome structure and genome organization. We report here the characterization and chromosome localization of a chromosome 5-specific repetitive DNA sequence. This repetitive DNA sequence was estimated to have at least 900 copies. DNA sequence analysis of three genomic clones which contain the repeat unit indicated that the DNA sequences have two sub-repeat units of 37 bp and 19 bp, connected by 30- to 90-bp short sequences with high similarity. RFLP mapping and physical mapping by fluorescence in situ hybridization (FISH) indicated that almost all copies of the repetitive DNA sequence are located in the centromeric heterochromatic region of the long arm of chromosome 5. The strategy for cloning such repetitive DNA sequences and their uses in rice genome research are discussed.

2.
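A new measure of DNA sequence information   Total citations: 4 (self-citations: 3; cited by others: 1)
Based on information theory, a new method for measuring the information in DNA sequences is presented, yielding information measures at four levels: Ib, If(1), If(2) and If(3). These four measures can be used to quantify the information content of the DNA base sequence, the codon sequence, the encoded protein sequence, and the functional protein sequence, respectively. Results computed from two short protein-coding DNA sequences in the mitochondrial genome of M. edulis, and from simulated DNA sequences built from groups of degenerate codons with different degrees of degeneracy, show that these information measures can indeed be used to reveal…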
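As an illustration of what a base-level information measure can look like, here is a minimal sketch that computes the ordinary Shannon entropy of the base composition. It is not the Ib measure of the abstract, whose definition is not given here; the function name `base_entropy` is hypothetical.

```python
import math
from collections import Counter


def base_entropy(seq: str) -> float:
    """Shannon entropy, in bits per base, of the base composition of seq.

    Illustrative only: this is the textbook entropy, not the paper's Ib.
    """
    counts = Counter(seq.upper())
    total = sum(counts.values())
    # Sum -p * log2(p) over the bases that actually occur in the sequence.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A sequence that uses all four bases equally often reaches the maximum of 2 bits per base, while a homopolymer scores 0.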

3.
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments, this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as well as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding.
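The query-versus-reference comparison described above can be sketched as follows. This is a simplified illustration, not TaxI itself: it assumes each query/reference pair has already been pairwise aligned (equal length, gaps written as `-`) and uses the uncorrected p-distance rather than a model of sequence evolution. The names `p_distance` and `rank_references` are hypothetical.

```python
def p_distance(a: str, b: str) -> float:
    """Uncorrected divergence between two pairwise-aligned sequences.

    Columns where either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        diffs += x != y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return diffs / compared


def rank_references(alignments: dict[str, tuple[str, str]]) -> list[tuple[str, float]]:
    """Rank references by divergence from the query, closest first.

    alignments maps a reference name to (aligned_query, aligned_reference).
    """
    scored = [(name, p_distance(q, r)) for name, (q, r) in alignments.items()]
    return sorted(scored, key=lambda item: item[1])
```

Because every pair is scored on its own alignment, indel-rich markers never need to be placed in one large multiple alignment.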

4.
Determination of window size for analyzing DNA sequences   Total citations: 4 (self-citations: 0; cited by others: 4)
Summary: DNA sequences are generally not random sequences. To show such nonrandomness visually, DNA sequence data are often plotted as moving averages for a certain length of window slid along a sequence. Here a simple algorithm is presented for determining the window size and for finding a nonrandom region of sequence.
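For readers unfamiliar with the moving-average plots mentioned above, here is a minimal sketch that computes GC content in a window slid along a sequence. The window size is simply passed in; the paper's algorithm for choosing it is not reproduced here, and the function name `sliding_gc` is hypothetical.

```python
def sliding_gc(seq: str, window: int, step: int = 1) -> list[float]:
    """GC fraction of each window of the given length slid along seq."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append((chunk.count("G") + chunk.count("C")) / window)
    return values
```

A small window gives a noisy curve and a large one blurs short features, which is why the choice of window size matters.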

5.
Base sequence studies of 300 nucleotide renatured repeated human DNA clones   Total citations: 117 (self-citations: 0; cited by others: 117)
A band of 300 nucleotide long duplex DNA is released by treating renatured repeated human DNA with the single strand-specific endonuclease S1. Since many of the interspersed repeated sequences in human DNA are 300 nucleotides long, this band should be enriched in such repeats. We have determined the nucleotide sequences of 15 clones constructed from these 300 nucleotide S1-resistant repeats. Ten of these cloned sequences are members of the Alu family of interspersed repeats. These ten sequences share a recognizable consensus sequence from which individual clones have an average divergence of 12.8%. The 300 nucleotide Alu family consensus sequence has a dimeric structure and was evidently formed from a head to tail duplication of an ancestral monomeric sequence. Three of the remaining clones are variations on a simple pentanucleotide sequence previously reported for human satellite III DNA. Two of the 15 clones have distinct and complex sequences and may represent other families of interspersed repeated sequences.

6.
Graphical representation of DNA sequences is one of the most popular techniques for alignment-free sequence comparison. Here, we propose a new method for the feature extraction of DNA sequences represented by binary images, by estimating the similarity between DNA sequences using the frequency histograms of local bitmap patterns of images. Our method shows linear time complexity for the length of DNA sequences, which is practical even when long sequences, such as whole genome sequences, are compared. We tested five distance measures for the estimation of sequence similarities, and found that the histogram intersection and Manhattan distance are the most appropriate ones for phylogenetic analyses.
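The two distance measures singled out above can be sketched as follows. The histograms are represented as plain dictionaries mapping a pattern label to its relative frequency; how the bitmap patterns are extracted from the images is not reproduced, and the function names are hypothetical.

```python
def normalize(counts: dict[str, int]) -> dict[str, float]:
    """Turn raw pattern counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()}


def histogram_intersection(p: dict[str, float], q: dict[str, float]) -> float:
    """Similarity in [0, 1]: the mass shared by two normalized histograms."""
    return sum(min(p.get(k, 0.0), q.get(k, 0.0)) for k in p.keys() | q.keys())


def manhattan_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """Sum of absolute frequency differences over all patterns."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in p.keys() | q.keys())
```

For normalized histograms the two are directly related: the Manhattan distance equals 2 × (1 − intersection).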

7.
8.
A database of the structural properties of all 32,896 unique DNA octamer sequences has been calculated, including information on stability, the minimum energy conformation and flexibility. The contents of the database have been analysed using a variety of Euclidean distance similarity measures. A global comparison of sequence similarity with structural similarity shows that the structural properties of DNA are much less diverse than the sequences, and that DNA sequence space is larger and more diverse than DNA structure space. Thus, there are many very different sequences that have very similar structural properties, and this may be useful for identifying DNA motifs that have similar functional properties that are not apparent from the sequences. On the other hand, there are also small numbers of almost identical sequences that have very different structural properties, and these could give rise to false-positives in methods used to identify function based on sequence alignment. A simple validation test demonstrates that structural similarity can differentiate between promoter and non-promoter DNA. Combining structural and sequence similarity improves promoter recall beyond that possible using either similarity measure alone, demonstrating that there is indeed information available in the structure of double-helical DNA that is not readily apparent from the sequence.
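The figure of 32,896 is consistent with counting an octamer and its reverse complement as a single duplex. That is an assumption here, since the abstract does not spell out the counting: of the 4^8 = 65,536 single strands, 4^4 = 256 are their own reverse complement, giving (65,536 + 256) / 2 = 32,896. The sketch below checks this by enumeration; the function names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_unique_duplexes(k: int) -> int:
    """Number of k-mers when a strand and its reverse complement count once."""
    seen = set()
    for bases in product("ACGT", repeat=k):
        strand = "".join(bases)
        # Keep one canonical representative per duplex.
        seen.add(min(strand, reverse_complement(strand)))
    return len(seen)
```

Calling `count_unique_duplexes(8)` enumerates all 65,536 strands and returns the duplex count.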

9.
Ten new wheat γ-gliadin gene sequences are reported and an analysis of γ-gliadin gene family structure is carried out using all known γ-gliadin sequences. The new sequences comprise four genomic clones with significantly more flanking DNA than previously reported, and six cDNA clones from a wheat endosperm EST project. Analysis of extended flanking DNA from the genomic clones indicates the limits of conservation of γ-gliadin DNA sequence that are similar to those previously found with other gliadin and glutenin genes and that are theorized to define the DNA sequence necessary for gene control. Most of the flanking DNA is not homologous to any reported DNA sequence, and one flanking region contains the first MITE-like (miniature inverted transposable element) DNA sequence associated with gliadin genes. About a quarter of the encoded polypeptides would contain a free cysteine residue – an observation that may relate to reports that at least some gliadins can participate in wheat endosperm glutenin polymer formation. The new sequences represent both genes closely related to those previously reported and a new sub-class of γ-gliadins.

10.
11.
SEGMENT: identifying compositional domains in DNA sequences   Total citations: 2 (self-citations: 0; cited by others: 2)
MOTIVATION: DNA sequences are formed by patches or domains of different nucleotide composition. In a few simple sequences, domains can simply be identified by eye; however, most DNA sequences show a complex compositional heterogeneity (fractal structure), which cannot be properly detected by current methods. Recently, a computationally efficient segmentation method to analyse such nonstationary sequence structures, based on the Jensen-Shannon entropic divergence, has been described. Specific algorithms implementing this method are now needed. RESULTS: Here we describe a heuristic segmentation algorithm for DNA sequences, which was implemented on a Windows program (SEGMENT). The program divides a DNA sequence into compositionally homogeneous domains by iterating a local optimization procedure at a given statistical significance. Once a sequence is partitioned into domains, a global measure of sequence compositional complexity (SCC), accounting for both the sizes and compositional biases of all the domains in the sequence, is derived. SEGMENT computes SCC as a function of the significance level, which provides a multiscale view of sequence complexity.

12.
The arrangements of inverted-repeated and repeated DNA sequences in the human genome have been investigated by an electron microscope method. The arrangement of the interspersed repeated DNA sequences is found to be similar to the corresponding arrangement found in Xenopus. This arrangement consists of 300-nucleotide-long repeated DNA sequences interspersed with roughly gene-size single-copy DNA sequences. The inverted-repeated sequences are also 300 nucleotides in length and are interspersed with the other DNA sequence classes. Most inverted-repeated sequences (64%) are spaced by another sequence which is recognized by electron microscopy as a single-stranded loop in a hairpin structure. The average length of this spacer loop is 1.6 kilobases. Although some pairs of inverted-repeated sequences are clustered, most seem to be randomly distributed throughout the genome. The average distance separating two pairs of inverted-repeated sequences is 10 to 20 kilobases. The interspersed repeated sequences and inverted-repeated sequences are arranged simultaneously in a portion of the human genome resulting in an interspersion of all three sequence classes.

13.
In order to study the derivation of the macronuclear genome from the micronuclear genome in Oxytricha nova, micronuclear DNA was partially digested with EcoRI, size fractionated, and then cloned in the lambda phage Charon 8. Clones were selected a) at random, b) by hybridization with macronuclear DNA, or c) by hybridization with clones of macronuclear DNA. One group of these clones contains only unique sequence DNA, and all of these had sequences that were homologous to macronuclear sequences. The number of macronuclear genes with sequences homologous to these micronuclear clones indicates that macronuclear sequences are clustered in the micronuclear genome. Many micronuclear clones contain repetitive DNA sequences and hybridize to numerous EcoRI fragments of total micronuclear DNA, yielding similar but non-identical patterns. Some micronuclear clones containing these repetitive sequences also contained unique sequence DNA that hybridized to a macronuclear sequence. These clones define a major interspersed repetitive sequence family in the micronuclear genome that is eliminated during formation of the macronuclear genome.

14.
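Traditional DNA sequence visualization models are limited to short DNA sequences and lack general methods for analyzing the resulting graphics. This article therefore proposes an image-based DNA sequence visualization model that converts a one-dimensional DNA sequence into a two-dimensional 256-level grayscale image. The model can visualize long DNA sequences and is highly compact in space. Analyzing the DNA visualization images with mature image-processing methods yields important information about the original DNA sequence, such as its size, the distribution of the four bases, and its degree of disorder. Comparing the visualization images of different DNA sequences yields information about the similarity of those sequences.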
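One way a DNA sequence can be mapped onto 256 gray levels is to pack four bases, at two bits each, into one pixel. The sketch below uses that encoding purely as an assumption; the abstract does not say how the article assigns gray values. The base codes, the padding with `A`, and the function names are all hypothetical.

```python
# Assumed two-bit code per base; the article's actual mapping is not given.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def dna_to_pixels(seq: str) -> list[int]:
    """Pack every four bases into one gray value in the range 0..255."""
    seq = seq.upper()
    pixels = []
    for start in range(0, len(seq), 4):
        # Pad a short final chunk with "A" (code 0) so it still fills a pixel.
        chunk = seq[start:start + 4].ljust(4, "A")
        value = 0
        for base in chunk:
            value = value * 4 + BASE_CODE[base]
        pixels.append(value)
    return pixels


def to_rows(pixels: list[int], width: int) -> list[list[int]]:
    """Arrange the pixel values into image rows of the given width."""
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]
```

Under this encoding a one-megabase sequence fits in a 500 × 500 image, which illustrates the spatial compactness the abstract refers to.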

15.
Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success for detection of coding regions in a gene. More recently, these methods have been used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence that is used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.

16.
S M Halling, N Kleckner. Cell, 1982, 28(1): 155-163
Transposon Tn10 inserts at many sites in the bacterial chromosome, but preferentially inserts at particular hotspots. We believe we have identified the target DNA signal responsible for this specificity. We have determined the DNA sequences of 11 Tn10 insertion sites and identified a particular 6 base pair (bp) symmetrical consensus sequence (GCTNAGC) common to those sites. The sequences at some sites differ from the consensus sequence but only in limited and well defined ways. The sequences at some sites differ more from the consensus sequence than do sequences at other sites, and the consensus sequence and closely related sequences are generally absent from potential target regions where Tn10 is known not to insert. Other aspects of the target DNA can significantly influence the efficiency with which a particular target site sequence is used. The 6 bp consensus sequence is symmetrically located within the 9 bp target DNA sequence that is cleaved and duplicated during Tn10 insertion. This juxtaposition of recognition and cleavage sites plus the symmetry of the perfect consensus sequence suggest that the target DNA may be both recognized and cleaved by the symmetrically disposed subunits of a single protein, as suggested for type II restriction endonucleases. There is plausible homology between the consensus sequence and the very ends of Tn10, compatible with recognition of transposon ends and target DNA by the same protein. The sequences of actual insertion sites deviate from the perfect consensus sequence in a way which suggests that the 6 bp specificity determinant may be recognized through protein-DNA contacts along the major groove of the DNA double helix.
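The symmetry of the consensus can be checked directly. GCTNAGC has six specified positions around an unspecified central N, and it reads the same on both strands. The sketch below verifies this and scans a sequence for exact matches to the consensus; the function names and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement, leaving the wildcard N unchanged."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def matches(site: str, consensus: str) -> bool:
    """True if site equals consensus at every position that is not N."""
    return len(site) == len(consensus) and all(
        c == "N" or s == c for s, c in zip(site.upper(), consensus.upper())
    )


def find_sites(seq: str, consensus: str) -> list[int]:
    """Start positions (0-based) of all exact matches to the consensus."""
    k = len(consensus)
    return [i for i in range(len(seq) - k + 1) if matches(seq[i:i + k], consensus)]
```

Because the consensus equals its own reverse complement, scanning one strand finds the matches on both.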

17.
DNA sequencing has resulted in an abundance of data on DNA sequences for various species. Hence, the characterization and comparison of sequences become more important but still difficult tasks. In this paper, we first give a 2-D ladderlike graphical representation for the characteristic sequences of a DNA sequence, and then construct a 3-component vector, in which the normalized ALE-indices extracted from such three 2-D graphs via D/D matrices are individual components, to characterize the DNA sequence. The examination of similarities/dissimilarities among sequences of the beta-globin genes of different species illustrates the utility of the approach.

18.
DNA sequence is an important determinant of the positioning, stability, and activity of nucleosomes, yet the molecular basis of these effects remains elusive. A "consensus DNA sequence" for nucleosome positioning has not been reported and, while certain DNA sequence preferences or motifs for nucleosome positioning have been discovered, how they function is not known. Here, we report that an unexpected observation concerning the reassembly of nucleosomes during salt gradient dialysis has allowed a breakthrough in our efforts to identify the nucleosomal locations of the DNA sequence motifs that dominate histone-DNA interactions and nucleosome positioning. We conclude that a previous selection experiment for high-affinity, nucleosome-forming DNA sequences exerted selective pressure chiefly on the central stretch of the nucleosomal DNA. This observation implies that algorithms for aligning the selected DNA sequences should seek to optimize the alignment over much less than the full 147 bp of nucleosomal DNA. A new alignment calculation implemented these ideas and successfully aligned 19 of the 41 sequences in a non-redundant database of selected high-affinity, nucleosome-positioning sequences. The resulting alignment reveals strong conservation of several stretches within a central 71 bp of the nucleosomal DNA. The alignment further reveals an inherent palindromic symmetry in the selected DNAs; it makes testable predictions of nucleosome positioning on the aligned sequences and for the creation of new positioning sequences, both of which are upheld experimentally; and it suggests new signals that may be important in translational nucleosome positioning.

19.
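Construction and application of a PC/Linux-based nucleic acid sequence analysis system   Total citations: 13 (self-citations: 2; cited by others: 11)
Using a PC running the Linux operating system together with the Phred/Phrap/Consed and Blast software, we built a system for large-scale automated analysis of nucleic acid sequences. The system automatically converts sequencing chromatograms into nucleic acid sequences, removes vector sequences, assembles sequences, identifies repetitive sequences, and analyzes sequence similarity, which can speed up the analysis and use of large-scale sequencing data.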

20.
The CapR protein is an ATP hydrolysis-dependent protease as well as a DNA-stimulated ATPase and a nucleic acid-binding protein. The sequences of the 5' end of the capR (lon) gene DNA and N-terminal end of the CapR protein were determined. The sequence of DNA that specifies the N-terminal portion of the CapR protein was identified by comparing the amino acid sequence of the CapR protein with the sequence predicted from the DNA. The DNA and protein sequences established that the mature protein is not processed from a precursor form. No sequence corresponding to an SOS box was found in the 5' sequence of DNA. There were sequences that corresponded to a putative -35 and -10 region for RNA polymerase binding. The capR (lon) gene was recently identified as one of 17 heat shock genes in Escherichia coli that are positively regulated by the product of the htpR gene. A comparison of the 5' DNA region of the capR gene with that of several other heat shock genes revealed possible consensus sequences.
