Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
The information decomposition (ID) method has been used to search for dinucleotide periodicities, including latent ones, in plant genomes. In nucleotide sequences of genomes of various plants from the GenBank database, 14,766 sequences with a periodicity of two nucleotides have been found at a high level of statistical significance. Classification of the periodicity matrices of the detected DNA sequences has yielded 141 classes of dinucleotide periodicity. Since ID does not detect periodicities with nucleotide deletions or insertions, modified profile analysis (MPA) has been applied to the obtained classes to reveal DNA sequences with dinucleotide periodicities containing nucleotide deletions and insertions. Combined use of ID and MPA has permitted the detection of 80,396 DNA sequences with dinucleotide periodicities in the genomes of various plants. The biological role of dinucleotide periodicity in the detected sequences is discussed.

2.
A method of informational decomposition has been developed, allowing one to reveal hidden periodicity in any symbol sequence. The informational decomposition is calculated without conversion of a symbol sequence into a numerical one, which facilitates finding periodicities in a symbol sequence. The method permits introducing an analog of the autocorrelation function of a symbol sequence. The method developed by us has been applied to reveal hidden periodicities in nucleotide and amino acid sequences, as well as in different poetical texts. Hidden periodicity has been detected in various genes, testifying to their quantum structure. The functional and structural role of hidden periodicity is discussed.
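
As an illustration of how periodicity can be scored directly on symbols, without converting them to numbers, the minimal Python sketch below computes the mutual information between the symbols of a sequence and their positions modulo a candidate period. This is an assumed reading of "informational decomposition", not the authors' implementation: the function name `mutual_information` and the scoring formula are illustrative, and the abstract does not specify the statistic or its significance test.

```python
from collections import Counter
from math import log


def mutual_information(seq, period):
    """Mutual information (in nats, scaled by sequence length) between
    symbols and positions modulo `period`. Hypothetical helper, assumed
    for illustration only."""
    n = len(seq)
    # Joint counts: how often each symbol occurs at each position of the period.
    cell = Counter((s, k % period) for k, s in enumerate(seq))
    # Marginal counts of symbols and of positions within the period.
    sym = Counter(seq)
    pos = Counter(k % period for k in range(n))

    def xlogx(counts):
        return sum(c * log(c) for c in counts.values())

    # sum m*ln(m) - sum x*ln(x) - sum y*ln(y) + n*ln(n)
    return xlogx(cell) - xlogx(sym) - xlogx(pos) + n * log(n)


if __name__ == "__main__":
    sequence = "AT" * 50
    for p in (2, 3):
        print(p, mutual_information(sequence, p))
```

A strongly periodic sequence gives a large value at its true period (and at multiples of it) and a value near zero at unrelated periods. Judging statistical significance would need a further step, such as comparison against shuffled sequences, which is not shown here.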

3.
The concept of the phase shift of triplet periodicity (TP) was used to search for potential DNA insertions in genes from 17 bacterial genomes. A mathematical algorithm for detection of these insertions has been developed. This approach can detect potential insertions and deletions with lengths that are not multiples of three bases, especially insertions of relatively large DNA fragments (>100 bases). A new similarity measure between triplet matrices was employed to improve the sensitivity of TP phase-shift detection. Sequences of 17,220 bacterial genes, each consisting of more than 1,200 bases, were analyzed, and a TP phase shift was found in ~16% of the analyzed genes (2,809 genes), which is about 4 times more than detected in our previous work. We propose that shifts of the TP phase may indicate shifts of the reading frame in genes after insertions of DNA fragments with lengths that are not multiples of three bases. A relationship between the phase shifts of TP and the frame shifts in genes is discussed.

4.
Fukushima A  Ikemura T  Kinouchi M  Oshima T  Kudo Y  Mori H  Kanaya S 《Gene》2002,300(1-2):203-211
We used a power spectrum method to identify periodic patterns in nucleotide sequences, and characterized nucleotide sequences that confer periodicities to prokaryotic and eukaryotic genomes. A 10-bp periodicity was prevalent in hyperthermophilic bacteria and archaebacteria, and an 11-bp periodicity was prevalent in eubacteria. The 10-bp periodicity was also prevalent in eukaryotes such as the worm Caenorhabditis elegans. Additionally, in the worm genome, a 68-bp periodicity in chromosome I, a 59-bp periodicity in chromosome II, and a 94-bp periodicity in chromosome III were found. In human chromosomes 21 and 22, approximately 167- or 84-bp periodicity was detected along the entire length of these chromosomes. Because 167 bp is identical to the length of DNA that forms two complete helical turns in nucleosome organization, we speculated that the respective sequences may correspond to arrays of a special compact form of nucleosomes clustered in specific regions of the human chromosomes. This periodic element contained a high frequency of TGG. TGG-rich sequences are known to form a specific subset of folded DNA structures, and therefore, the sequences might have potential to form specific higher order structures related to the clustered occurrence of a specific form of the speculated nucleosomes.
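
The idea of a power spectrum applied to a nucleotide sequence can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the paper: each base is turned into a 0/1 indicator sequence, the Fourier coefficient at frequency 1/period is evaluated directly, and the squared magnitudes are summed over the four bases. The function names `period_power` and `strongest_period`, and the normalization by sequence length, are hypothetical choices made for this example.

```python
import cmath


def period_power(seq, period, alphabet="ACGT"):
    """Summed spectral power of the base-indicator sequences at
    frequency 1/period (assumed normalization: divide by length)."""
    n = len(seq)
    total = 0.0
    for base in alphabet:
        # Fourier coefficient of the 0/1 indicator sequence for this base.
        coeff = sum(cmath.exp(-2j * cmath.pi * k / period)
                    for k, s in enumerate(seq) if s == base)
        total += abs(coeff) ** 2
    return total / n


def strongest_period(seq, candidates):
    """Return the candidate period with the highest spectral power."""
    return max(candidates, key=lambda p: period_power(seq, p))


if __name__ == "__main__":
    # Synthetic test sequence: a 10-base unit repeated 30 times.
    sequence = "ACGTTGCAAG" * 30
    print(strongest_period(sequence, range(7, 14)))
```

For genome-scale data, a fast Fourier transform over the indicator sequences would replace the direct sum used here; the direct form is kept only because it makes the definition easy to read.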

5.
A method of informational decomposition has been developed, allowing one to reveal hidden periodicity in any symbol sequence. The informational decomposition is calculated without conversion of a symbol sequence into a numerical one, which facilitates finding periodicities in a symbol sequence. The method permits introducing an analog of the autocorrelation function of a symbol sequence. The method developed by us has been applied to reveal hidden periodicities in nucleotide and amino acid sequences, as well as in different poetical texts. Hidden periodicity has been detected in various genes, testifying to their quantum structure. The functional and structural role of hidden periodicity is discussed.

6.
Genomic DNA sequences contain a wealth of information about the bendability and curvature of the DNA molecule. For example, the well-known 10-11 bp periodicities within genomes can be attributed to supercoiled structures or wrapping around nucleosomes. Such periodic signals have previously been examined mainly based on mono- or dinucleotide correlations. In this study, we generalize this approach and analyze correlation functions of longer motifs such as tetramers or poly(A) sequences. Periodically placed motifs may indicate regular protein binding or curvature signals. We detected various periodic signals, e.g., strong 10-11 bp oscillations of periodically placed poly(A), poly(T) or poly(W) stretches. These observations lead to a new view on the intensively studied 10-11 bp periodicities.

7.
Latent amino acid repeats seem to be widespread in genetic sequences and to reflect their structure, function, and evolution. We have recently identified latent periodicity in more than 150 protein families including protein kinases and various nucleotide-binding proteins. The latent repeats in these families were correlated to their structure and evolution. However, a majority of known protein families were not identified with our latent periodicity search algorithm. The presumed main reason for this was the inability of our techniques to identify periodicities interspersed with insertions and deletions. We designed a new latent periodicity search algorithm, which is capable of taking into account insertions and deletions. As a result, we identified many novel cases of latent periodicity peculiar to protein families. Possible origins of the periodic structure of these families are discussed. In summary, we presume that latent periodicity is present in a substantial portion of known protein families. The latent periodicity matrices and the results of Swiss-Prot scans are available from http://bioinf.narod.ru/del/.

8.
A mathematical method has been developed to search for latent periodicity in protein amino-acid and other symbolic sequences using dynamic programming and random matrices. The method allows the detection of latent periodicity with insertions and deletions at positions that are unknown beforehand. The developed method has been applied to search for periodicity in the amino-acid sequences of several proteins and in the euro/dollar exchange rate since 2001. The presence of a long period with insertions and deletions in amino-acid sequences is shown. The period length of seven amino acids is observed in the proteins that contain supercoiled regions (a coiled-coil structure) as well as of six, five, or more amino acids. The existence of period lengths of 6 and 7 days, as well as 24 and 25 h, in the analyzed financial time series is observed; note that this periodicity is detectable only when insertions and deletions are taken into account. The causes that underlie the occurrence of latent periodicity with insertions and deletions in amino-acid sequences and financial time series are discussed.

9.
The concept of the phase shift of triplet periodicity (TP) was used to search for potential DNA insertions in genes from 17 bacterial genomes. A mathematical algorithm for detection of these insertions has been developed. This approach can detect potential insertions and deletions with lengths that are not multiples of three bases, especially insertions of relatively large DNA fragments (>100 bases). A new similarity measure between triplet matrices was employed to improve the sensitivity of TP phase-shift detection. Sequences of 17,220 bacterial genes, each consisting of more than 1,200 bases, were analyzed, and a TP phase shift was found in ~16% of the analyzed genes (2,809 genes), which is about 4 times more than detected in our previous work. We propose that shifts of the TP phase may indicate shifts of the reading frame in genes after insertions of DNA fragments with lengths that are not multiples of three bases. A relationship between the phase shifts of TP and the frame shifts in genes is discussed.

10.
The nucleotide sequence of the crossover region in the genomes of two intertypic (type 3/type 1) poliovirus recombinants has been determined by the primer extension method. No deletions, insertions or rearrangements have been observed. Identical contiguous sequences, 7 or 11 nucleotides in length, respectively, have been found in two regions of the parental genomes involved in the recombination.

11.
HindIII-O/N DNA fragments of vaccinia virus (VV) of the LIVP strain were mapped using thirteen restriction endonucleases. Nucleotide sequences of the HindIII-O fragment (1530 bp) as well as of a site of the HindIII-N genome fragment 353 bp in size were determined. Comparison of restriction maps and nucleotide sequences of VV strains (WR and LIVP) demonstrated that DNA of VV LIVP contained % deletions and 2 insertions. "Reliable" short direct repeats were localized and their possible role in the formation of DNA deletions was shown. It was suggested that VV endonuclease and DNA ligase participate in replication and repair processes. The mechanism of formation of variable sequences of viral genomes is discussed.

12.
Insertions and deletions are responsible for gaps in aligned nucleotide sequences, but they have usually been ignored when the number of nucleotide substitutions was estimated. We compared six sets of nuclear and mitochondrial noncoding DNA sequences of primates and obtained estimates of the evolutionary rate of insertion and deletion. The maximum-parsimony principle was applied to locate insertions and deletions on a given phylogenetic tree. Deletions were about twice as frequent as insertions for nuclear DNA, and single-nucleotide insertions and deletions were the most frequent of all events. The rate of insertion and deletion was found to be rather constant among branches of the phylogenetic tree, and the rate (approximately 2.0/kb/Myr) for mitochondrial DNA was found to be much higher than that (approximately 0.2/kb/Myr) for nuclear DNA. The rates of nucleotide substitution were about 10 times higher than the rate of insertion and deletion for both nuclear and mitochondrial DNA.

13.
DNA sequences contain information about the bendability and native conformation of DNA. For example, a repetition of certain dinucleotides at distances of 10-11bp supports wrapping around nucleosomes and supercoiled structures of bacterial DNA. We analyzed 86 eubacterial genomes, 16 archaea, and six genomes of higher eukaryotes. First, we discuss whether or not the observed periodicities indeed represent bendability signals. This claim is confirmed since: (1) dinucleotide signals are of comparable size to mononucleotide signals, (2) the signals are present in non-coding DNA as well, and (3) repeat masking has only a minor effect on 10-11bp periodicities. Moreover, the periodicities persist up to 150bp, comparable to the nucleosome size. We show that doublet peaks in Caenorhabditis elegans and some prokaryotes can be traced back to long-ranging modulations. In mammalian genomes, we consistently find spectral peaks as observed earlier in human chromosomes 20, 21 and 22. It has been shown in previous studies that archaea have periods of 10bp, whereas eubacteria exhibit 11bp periodicities. These differences reflect different supercoiled states of microbial DNA. Is the period of 10bp an archaeal or a thermophilic feature? This question is addressed by relating periodicities to optimal growth temperatures. It turns out that the archaea Methanopyrus kandleri (t(opt)=80 °C) and a Halobacterium strain (t(opt)=42 °C) both have longer periods of about 11bp. Eubacterial genomes consistently have periods around 11bp, indicative of negative supercoiling.

14.
In plant genomes, the incorporation of DNA segments is not a common method of artificial gene transfer. Nevertheless, various segments of pararetroviruses have been found in plant genomes in recent decades. The rice genome contains a number of segments of endogenous rice tungro bacilliform virus‐like sequences (ERTBVs), many of which are present between AT dinucleotide repeats (ATrs). Comparison of genomic sequences between two closely related rice subspecies, japonica and indica, allowed us to verify the preferential insertion of ERTBVs into ATrs. In addition to ERTBVs, the comparative analyses showed that ATrs occasionally incorporate repeat sequences including transposable elements, and a wide range of other sequences. Besides the known genomic sequences, the insertion sequences also represented DNAs of unclear origins together with ERTBVs, suggesting that ATrs have integrated episomal DNAs that would have been suspended in the nucleus. Such insertion DNAs might be trapped by ATrs in the genome in a host‐dependent manner. Conversely, other simple mono‐ and dinucleotide sequence repeats (SSR) were less frequently involved in insertion events relative to ATrs. Therefore, ATrs could be regarded as hot spots of double‐strand breaks that induce non‐homologous end joining. The insertions within ATrs occasionally generated new gene‐related sequences or involved structural modifications of existing genes. Likewise, in a comparison between Arabidopsis thaliana and Arabidopsis lyrata, the insertions preferred ATrs to other SSRs. Therefore ATrs in plant genomes could be considered as genomic dumping sites that have trapped various DNA molecules and may have exerted a powerful evolutionary force.

15.
Two different views have been proposed for the origins of genes (or proteins). One is that primordial genes evolved from random sequences. This view underlies the concept of modern in vitro evolution experiments that functional molecules (even proteins) evolved from random sequence-libraries. In contrast, the second view holds that "random sequences" would be an unusual state in which to find RNA or DNA, because it is their inherent nature to yield periodic structures during the course of semi-conservative replication. In this second view, the periodicity of DNA (or RNA) is responsible for the emergence of primordial genes. Although recent reports on the variety of periodicities present in proteins, genes and genomes are consistent with the second view, it has yet to be experimentally tested. We assessed the significance of periodicities of DNA in the origin of genes by constructing such periodic DNAs. The results showed that periodic DNA produced ordered proteins at very high rates, which is in contrast to the fact that proteins with random sequences lack secondary structures. We concluded that periodicity played a pivotal role in the origin of many genes. The observation should pave the way for new experimental evolution systems for proteins.

16.
Liu H  Wu J  Xie J  Yang X  Lu Z  Sun X 《Biophysical journal》2008,94(12):4597-4604
By analyzing dinucleotide position-frequency data of yeast nucleosome-bound DNA sequences, dinucleotide periodicities of core DNA sequences were investigated. Within frequency domains, weakly bound dinucleotides (AA, AT, and the combinations AA-TT-TA and AA-TT-TA-AT) present doublet peaks in a periodicity range of 10-11 bp, and strongly bound dinucleotides present a single peak. A time-frequency analysis, based on wavelet transformation, indicated that weakly bound dinucleotides of core DNA sequences were more closely spaced (∼10.3 bp) at the two ends, with larger (∼11.1 bp) spacing in the middle section. The finding was supported by DNA curvature and was prevalent in all core DNA sequences. Therefore, three approaches were developed to predict nucleosome positions. After analyzing a 2200-bp DNA sequence, results indicated that the predictions were feasible; areas near protein-DNA binding sites resulted in periodicity profiles with irregular signals. The effects of five dinucleotide patterns were evaluated, indicating that the AA-TT pattern exhibited better performance. A chromosome-scale prediction demonstrated that periodicity profiles perform better than previously described, with up to 59% accuracy. Based on predictions, nucleosome distributions near the beginning and end of open reading frames were analyzed. Results indicated that the majority of open reading frames’ start and end sites were occupied by nucleosomes.

17.
A comparison of rice chloroplast genomes   Total citations: 19, self-citations: 0, citations by others: 19
Tang J  Xia H  Cao M  Zhang X  Zeng W  Hu S  Tong W  Wang J  Wang J  Yu J  Yang H  Zhu L 《Plant physiology》2004,135(1):412-420
Using high quality sequence reads extracted from our whole genome shotgun repository, we assembled two chloroplast genome sequences from two rice (Oryza sativa) varieties, one from 93-11 (a typical indica variety) and the other from PA64S (an indica-like variety with maternal origin of japonica), which are both parental varieties of the super-hybrid rice, LYP9. Based on the patterns of high sequence coverage, we partitioned chloroplast sequence variations into two classes, intravarietal and intersubspecific polymorphisms. Intravarietal polymorphisms refer to variations within 93-11 or PA64S. Intersubspecific polymorphisms were identified by comparing the major genotypes of the two subspecies represented by 93-11 and PA64S, respectively. Some of the minor genotypes occurring as intravarietal polymorphisms in one variety existed as major genotypes in the other subspecific variety, thus giving rise to intersubspecific polymorphisms. In our study, we found that the intersubspecific variations of 93-11 (indica) and PA64S (japonica) chloroplast genomes consisted of 72 single nucleotide polymorphisms and 27 insertions or deletions. The intersubspecific polymorphism rates between 93-11 and PA64S were 0.05% for single nucleotide polymorphisms and 0.02% for insertions or deletions, nearly 8 and 10 times lower than their respective nuclear genomes. Based on the total number of nucleotide substitutions between the two chloroplast genomes, we dated the divergence of indica and japonica chloroplast genomes as occurring approximately 86,000 to 200,000 years ago.

18.
19.
We used the method of Information Decomposition developed by us to identify regions of latent dinucleotide periodicity in bacterial genomes. The number of potential minisatellite sequences obtained at a high level of statistical significance was 454. Then we classified the periodicity matrices and obtained 45 classes. Using these classes, we applied another new method developed by us, Modified Profile Analysis, to reveal more periodic sequences in the presence of indels. The number of sequences found by the combination of these two methods was 3949. Most of them cannot be revealed by other methods, including dynamic programming and Fourier transformation.

20.
Transposed copies of mitochondrial DNA into the nucleus (numts) are widespread, but to date they have not been described from the Coleoptera (beetles). Here we report the discovery of a numt derived from a mitochondrial ribosomal RNA gene in Australian tiger beetles (genus Rivacindela). The loss of function of the numt was confirmed by a high proportion of transversions, numerous noncompensatory substitutions in stem regions, and large deletions in functionally important sequences. Phylogenetic analysis of orthologous numt sequences was performed together with the corresponding mtDNA lineage for a study of origination and establishment of the transposed copies in closely related populations and species. All numt sequences were strongly supported to be monophyletic, indicating a single origin of this element. However, populations were polymorphic for the presence of the numt, and phylogenetic trees based on the numt sequences showed inconsistencies with the corresponding mtDNA phylogeny, suggesting slower processes of fixation compared to the mtDNA sequences. In a side-by-side comparison with their mtDNA sister lineage, the nucleotide substitution rate of 1.66 × 10^-8 substitutions/site/year in the numts was approximately equal to the average rate of mtDNA in this group but substantially higher than previous estimates of neutral nuclear rates in vertebrates. The numt clade was affected by several deletions but no insertions, with estimates of nucleotide loss exceeding the rate of nucleotide substitutions by approximately five times. The young age of the Rivacindela numt clade, their absence in species outside of a narrow lineage of related individuals, and the high rate of deletions suggest that insertions do not persist in this group, which is consistent with the view that comparatively small genomes such as those of Coleoptera harbor fewer mitochondrial and other nuclear pseudogenes.
