Found 20 similar articles (search time: 0 ms)
1.
The information decomposition (ID) method has been used to search for dinucleotide periodicities, including latent ones, in plant genomes. In nucleotide sequences of genomes of various plants from the GenBank database, 14766 sequences with a periodicity of two nucleotides have been found. Classification of the periodicity matrices of the detected DNA sequences has yielded 141 classes of dinucleotide periodicity. Since ID does not detect periodicities with nucleotide deletions or insertions, modified profile analysis (MPA) has been applied to the obtained classes to reveal DNA sequences with dinucleotide periodicities containing nucleotide deletions and insertions. Combined use of ID and MPA has permitted the detection of 80396 DNA sequences with dinucleotide periodicities in the genomes of various plants. The biological role of dinucleotide periodicity in the detected sequences is discussed.
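As a rough illustration of what a periodicity matrix is, the sketch below counts nucleotides by position within the period and scores the result with the mutual information between period position and nucleotide. This is an assumption made for illustration, not the authors' ID implementation; the function names are hypothetical and the statistic used in the paper may differ.

```python
from collections import Counter
from math import log


def periodicity_matrix(seq, period):
    """Count occurrences of each nucleotide at each position of the period."""
    return Counter((i % period, base) for i, base in enumerate(seq))


def mutual_information(seq, period):
    """Mutual information (in nats) between period position and nucleotide.

    Illustrative score only: it is zero when the nucleotide composition
    is the same at every position of the period.
    """
    n = len(seq)
    joint = periodicity_matrix(seq, period)
    pos_total = Counter()
    base_total = Counter()
    for (pos, base), count in joint.items():
        pos_total[pos] += count
        base_total[base] += count
    return sum(
        (count / n) * log(count * n / (pos_total[pos] * base_total[base]))
        for (pos, base), count in joint.items()
    )
```

A perfect dinucleotide repeat such as `"ATATATAT"` with `period=2` gives the maximum score for two positions, while a homopolymer gives zero.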
2.
3.
A method is proposed to represent and analyze complete genome sequences (52 species of prokaryotes and eukaryotes), based on the frequencies of amino acid pairs (bigrams) separated by a given number of other residues. For each species analyzed, it allows us to construct over-abundant and over-deficient occurrence profiles, summarizing amino acid bigram frequencies over the entire genome. The method deals efficiently with the sparseness of statistical representations of individual sequences, and describes every gene sequence in the same way, independently of its length and of the genome size. The frequency of over-abundant and over-deficient occurrences of bigrams shows a singular periodicity of around 3.5 peptide bonds, suggesting a relation with the alpha-helical secondary structure.
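The gapped bigram counting described above can be sketched as follows. The observed/expected ratio is one plausible way to flag over-abundant (ratio above 1) and over-deficient (ratio below 1) pairs; it is an assumption here, not the paper's actual statistic, and the function names are hypothetical.

```python
from collections import Counter


def gapped_bigrams(protein, gap):
    """Count residue pairs (a, b) where b follows a with `gap` residues between them."""
    step = gap + 1
    return Counter(zip(protein, protein[step:]))


def observed_over_expected(protein, gap):
    """Ratio of observed pair frequency to the product of single-residue frequencies."""
    n = len(protein)
    n_pairs = n - (gap + 1)
    residue_freq = Counter(protein)
    return {
        (a, b): (count / n_pairs) / ((residue_freq[a] / n) * (residue_freq[b] / n))
        for (a, b), count in gapped_bigrams(protein, gap).items()
    }
```

Scanning `gap` over a range of values and plotting the ratios would show how the over- and under-representation varies with separation.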
4.
We introduce a new concept, the triplet periodicity class (TPC), and a measure of similarity between such classes. We classified 472288 triplet periodicity (TP) regions found in 578868 genes from the 29th release of the KEGG databank. In total, 2520 classes were obtained. They contain 94% of the 472288 TP cases found. For 92% of the TP regions contained in classes, the same linkage of TP to the open reading frame (ORF) is observed. For 8% of TP cases, we revealed a shift between the ORF of a gene and the ORF common to the majority of genes contained in a TPC. For these 8% of periodic regions, the hypothetical amino acid sequences corresponding to the ORF built by the TPC were constructed. The BLAST program has shown that 2679 hypothetical amino acid sequences have statistically significant similarity with proteins from the UniProt databank. We suppose that the 8% of TP regions contained in classes possess a mutation originating from an ORF shift. The obtained TPCs can be used to identify the coding regions of genes as well as to search for mutations arising from ORF shifts.
5.
Wei-min Liu 《Journal of mathematical biology》1993,31(5):487-494
This paper presents a theoretical analysis of an SEIRS epidemiological model. It shows that nonlinearity due to a dose-dependent latent period can cause periodicity through a Hopf bifurcation.
6.
Genomes of almost all organisms have been found to exhibit several periodicities, the most prominent of which is the three-base periodicity. It is more pronounced in the gene-coding regions and has been exploited to identify the segments of a genome that code for a protein. The three-base periodicity in the gene-coding regions has been attributed to inhomogeneous nucleotide compositions in the three codon positions. However, this cannot explain the three-base periodicity present at the level of the whole genome, where the codon concept is not applicable. Even though the distribution of each nucleotide is uniform at positions 0 (mod 3), 1 (mod 3) and 2 (mod 3) when whole-genome data are considered, our analysis reveals that the three-base periodicity arises from higher correlations among nucleotides separated by three bases.
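One simple way to look at correlations between nucleotides a fixed distance apart is to compare how often bases at that distance are identical with what the overall composition alone would predict. The sketch below does this; the measure and the function name are illustrative assumptions, not the correlation statistic used in the paper.

```python
from collections import Counter


def same_base_excess(seq, lag):
    """Observed minus expected probability that bases `lag` apart are identical.

    The expected value assumes independent positions with the sequence's
    overall nucleotide composition.
    """
    n_pairs = len(seq) - lag
    if n_pairs <= 0:
        raise ValueError("sequence must be longer than the lag")
    observed = sum(a == b for a, b in zip(seq, seq[lag:])) / n_pairs
    composition = Counter(seq)
    expected = sum((count / len(seq)) ** 2 for count in composition.values())
    return observed - expected
```

Evaluating this for lags 1, 2, 3, ... on a long sequence and looking for peaks at multiples of 3 would be one way to visualize a three-base periodicity that composition alone does not explain.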
7.
Background
Alternative splicing (AS) contributes significantly to protein diversity, by selectively using different combinations of exons of the same gene under certain circumstances. One particular type of AS is the use of alternative first exons (AFEs), which can have consequences far beyond the fine-tuning of protein functions. For example, AFEs may change the N-termini of proteins and thereby direct them to different cellular compartments. When alternative first exons are distant, they are usually associated with alternative promoters, thereby conferring an extra level of gene expression regulation. However, only a few studies have examined the patterns of AFEs, and these analyses were mainly focused on mammalian genomes. Recent studies have shown that AFEs exist in the rice genome, and are regulated in a tissue-specific manner. Our current understanding of AFEs in plants is still limited, including important issues such as their regulation, contribution to protein diversity, and evolutionary conservation.
8.
Internal repeats in protein sequences have wide-ranging implications for the structure and function of proteins. A keen analysis
of the repeats in protein sequences may help us to better understand the structural organization of proteins and their evolutionary
relations. In this paper, a mathematical method for searching for latent periodicity in protein sequences is developed. Using
this method, we identified simple sequence repeats in the alkaline proteases and found that the sequences could show the same
periodicity as their tertiary structures. This result may help us to reduce difficulties in the study of the relationship
between sequences and their structures.
9.
Original spectral-statistical methods were developed to recognize a new type of latent periodicity in DNA, called latent profile periodicity, or latent profility. Searching for latent profility allows the detection of different levels of information coding in genes and local DNA segments.
10.
Dinucleotide usage is known to vary in the genomes of organisms. The dinucleotide usage profiles or genome signatures are similar for sequence samples taken from the same genome, but are different for taxonomically distant species. This concept of genome signatures has been used to study several organisms including viruses, to elucidate the signatures of evolutionary processes at the genome level. Genome signatures assume greater importance in the case of host–pathogen interactions, where molecular interactions between the two species take place continuously, and can influence their genomic composition. In this study, analyses of whole-genome sequences of HIV-1 subtype B, a retrovirus that caused the global AIDS pandemic, have been carried out to analyse the variation in genome signatures of the virus from 1983 to 2007. We show statistically significant temporal variations in some dinucleotide patterns, highlighting the selective evolution of the dinucleotide profiles of HIV-1 subtype B, possibly a consequence of host-specific selection.
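A dinucleotide usage profile of the kind described above is commonly expressed as an odds ratio: the observed dinucleotide frequency divided by the product of the two mononucleotide frequencies. The sketch below assumes that definition and, as a simplification, works on a single strand only; the function names and the distance measure are hypothetical and not taken from the study.

```python
from collections import Counter
from itertools import product


def genome_signature(seq):
    """Dinucleotide odds ratios f(xy) / (f(x) * f(y)) for all 16 dinucleotides."""
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two nucleotides")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    signature = {}
    for x, y in product("ACGT", repeat=2):
        fx, fy = mono[x] / n, mono[y] / n
        fxy = di[x + y] / (n - 1)
        # Dinucleotides whose bases never occur get a ratio of 0.0 by convention.
        signature[x + y] = fxy / (fx * fy) if fx and fy else 0.0
    return signature


def signature_distance(sig_a, sig_b):
    """Mean absolute difference between two signatures."""
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / len(sig_a)
```

Comparing signatures of isolates sampled in different years with `signature_distance` would give a crude picture of temporal drift; testing its significance needs a proper statistical procedure.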
11.
The genomes of flowering plants vary in size from about 0.1 to over 100 gigabase pairs (Gbp), mostly because of polyploidy and variation in the abundance of repetitive elements in intergenic regions. High-quality sequences of the relatively small genomes of Arabidopsis (0.14 Gbp) and rice (0.4 Gbp) have now been largely completed. The sequencing of plant genomes that have a more representative size (the mean for flowering plant genomes is 5.6 Gbp) has been seen as a daunting task, partly because of their size and partly because of the numerous highly conserved repeats. Nevertheless, creative strategies and powerful new tools have been generated recently in the plant genetics community, so that sequencing large plant genomes is now a realistic possibility. Maize (2.4-2.7 Gbp) will be the first gigabase-size plant genome to be sequenced using these novel approaches. Pilot studies on maize indicate that the new gene-enrichment, gene-finishing and gene-orientation technologies are efficient, robust and comprehensive. These strategies will succeed in sequencing the gene-space of large-genome plants, and in locating all of these genes and adjacent sequences on the genetic and physical maps.
12.
13.
The large number of ESTs generated for Arabidopsis and rice in recent years now acts as an important complement to whole-genome sequencing projects. The Arabidopsis Genome Initiative has begun a coordinated effort to sequence the entire genome and, as a result, increasing numbers of large sequence entries can be found in the public databases. In addition, the mitochondrial genome of Arabidopsis has been completely sequenced. Genome sequencing studies and the public sequence databases have begun to influence the direction of diverse areas of research from physiology to evolution.
14.
The genome structure of several species of Gramineae and Drosophila was investigated by the DNA renaturation method. The kinetics of DNA reassociation were studied by direct optical scanning, and the Cot curve data were analyzed with an improved computer program, "Finger". Differences between the DNA structure of animals and plants are shown. Plant genomes lack the unique fraction that exists in animal genomes. The slowly reassociating fraction in plants comprises about 20% of the DNA, compared with more than 60% in animal DNA. An analysis of kinetic complexity indicates that the relative content of the slowly reassociating fraction in the genomes of both animals and plants is much higher than that of the highly repeated DNA fraction.
15.
All amino acid sequences derived from 248 prokaryotic genomes, 10 invertebrate genomes (plants and fungi) and 10 vertebrate genomes were analysed by the autocorrelation function of charge sequences. The analysis of the total amino acid sequences derived from the 268 biological genomes showed that a significant periodicity of 28 residues is observable for the vertebrate genomes, but not for the other genomes. When proteins with a charge periodicity of 28 residues (PCP28) were selected from the total proteomes, we found that PCP28 in fact exists in all proteomes, but the number of PCP28 is much larger for the vertebrate proteomes than for the other proteomes. Although excess PCP28 in the vertebrate proteomes are only poorly characterized, a detailed inspection of the databases suggests that most excess PCP28 are nuclear proteins.
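The autocorrelation of a charge sequence can be sketched as below. The charge assignment (+1 for K and R, -1 for D and E, 0 otherwise, with histidine treated as neutral) and the absence of any normalization are assumptions for illustration; the paper's exact definition is not given in the abstract, and the names are hypothetical.

```python
# Assumed charge assignment; every other residue (including H) counts as 0.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def charge_autocorrelation(protein, lag):
    """Mean product of residue charges separated by `lag` positions."""
    charges = [CHARGE.get(residue, 0) for residue in protein]
    n_pairs = len(charges) - lag
    if n_pairs <= 0:
        raise ValueError("sequence must be longer than the lag")
    return sum(a * b for a, b in zip(charges, charges[lag:])) / n_pairs
```

Under these assumptions, a protein with a 28-residue charge periodicity would show a positive peak in `charge_autocorrelation(protein, 28)` relative to neighbouring lags.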
16.
Periodical organisms, such as bamboos and periodical cicadas, are well known for their synchronous reproduction. In bamboos and other periodical plants, the synchronicity of mass flowering and withering has often been reported, indicating that these species are monocarpic (semelparous). Therefore, synchronicity and periodicity are often suspected to be fairly tightly coupled traits in these periodical plants. We investigate the periodicity and synchronicity of Strobilanthes flexicaulis and a closely related species, S. tashiroi, on Okinawa Island, Japan. The genus Strobilanthes is known for several periodical species. Based on 32 years of observational data, we confirmed that S. flexicaulis is a 6-year periodical mass-flowering monocarpic plant. All the flowering plants died after flowering. In contrast, three years of individual tracking showed that S. tashiroi is a polycarpic perennial with no mass flowering. We also surveyed six local populations of S. flexicaulis and found variation in synchronicity, from four highly synchronized populations (>98% of plants flowering in the mass year) to two less synchronized ones with 11-47% of plants flowering before and after the mass year. This result might imply that synchrony is selected for when periodicity is established in monocarpic species. We found selective advantages for mass flowering in pollinator activities and predator satiation. The current results suggest that the periodical S. flexicaulis might have evolved periodicity from a non-periodical close relative. The current report should become a key finding for understanding the evolution of periodical plants.
17.
Background
In order to find traits or evolutionary relics of the primordial genome (the most primitive nucleic acid genome for earth's life) that remain in modern genomes, we have studied the characteristics of dinucleotide frequencies across genomes. The longer a sequence is, the more likely it is to be modified during genome evolution. For that reason, short nucleotide sequences, especially dinucleotides, would have a considerable chance of remaining intact over billions of years of evolution. Consequently, conservation of the genomic profiles of dinucleotide frequencies across modern genomes may exist and would be an evolutionary relic of the primordial genome.
18.
Srividhya KV Alaguraj V Poornima G Kumar D Singh GP Raghavenderan L Katta AV Mehta P Krishnaswamy S 《PloS one》2007,2(11):e1193
Background
Prophages are integrated viral forms in bacterial genomes that have been found to contribute to interstrain genetic variability. Many virulence-associated genes are reported to be prophage encoded. Present computational methods to detect prophages are either by identifying possible essential proteins such as integrases or by an extension of this technique, which involves identifying a region containing proteins similar to those occurring in prophages. These methods suffer due to the problem of low sequence similarity at the protein level, which suggests that a nucleotide based approach could be useful.
Methodology
Earlier, dinucleotide relative abundance (DRA) has been used to identify regions in genomes that deviate from their neighboring areas. We have used the difference in dinucleotide relative abundance (DRAD) between bacterial and prophage DNA to help locate DNA stretches that could be of prophage origin in bacterial genomes. Prophage sequences that deviate from bacterial regions in their dinucleotide frequencies are detected by scanning bacterial genome sequences. The method was validated using a subset of genomes with prophage data from literature reports. A web interface for prophage scanning based on this method is available at http://bicmku.in:8082/prophagedb/dra.html. Two hundred bacterial genomes that do not have annotated prophages have been scanned for prophage regions using this method.
Conclusions
The relative dinucleotide distribution difference helps detect prophage regions in genome sequences. The usefulness of this method is seen in the identification of 461 highly probable prophage loci that had not previously been annotated as such. This work emphasizes the need to extend the efforts to detect and annotate prophage elements in genome sequences.
19.
20.
A web server for searching for latent periodicity based on the method of modified profile analysis has been developed. This method allows latent periodicity to be found in the presence of insertions and deletions. The search uses periodicity classes that we found earlier for various groups of organisms. The period length lies in the range 2-20 nt, not including the triplet periodicity. The results obtained are subjected to various filtration steps to ensure their statistical significance. Availability: Use of the web server is free for non-commercial users. No registration is required. The URL of the server is http://victoria.biengi.ac.ru/lepscan. The current software version is 1.06.