Found 20 similar articles (search time: 15 ms)
1.
The information decomposition (ID) method has been used for searching dinucleotide periodicities, including latent ones, in plant genomes. In nucleotide sequences of genomes of various plants from the GenBank database, 14766 sequences with a periodicity of two nucleotides have been found. Classification of the periodicity matrices of the detected DNA sequences has yielded 141 classes of dinucleotide periodicity. Since ID does not detect periodicities with nucleotide deletions or insertions, modified profile analysis (MPA) has been applied to the obtained classes to reveal DNA sequences with dinucleotide periodicities containing nucleotide deletions and insertions. Combined use of ID and MPA has permitted the detection of 80 396 DNA sequences with dinucleotide periodicities in the genomes of various plants. The biological role of dinucleotide periodicity in the detected sequences is discussed.
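The idea of a periodicity matrix can be illustrated with a short sketch. It is not the authors' ID statistic: it only tabulates how often each base occurs at each position modulo the period and reports the mutual information between base and position. The function names `periodicity_matrix` and `mutual_information` are hypothetical, and insertions and deletions are not handled.

```python
import math
from collections import Counter

def periodicity_matrix(seq, period):
    """Count how often each base occurs at each position modulo `period`."""
    return Counter((base, i % period) for i, base in enumerate(seq))

def mutual_information(seq, period):
    """Mutual information (bits per symbol) between base and position-in-period.

    Illustrative assumption: a high value means the base composition depends
    strongly on the position within the period.
    """
    matrix = periodicity_matrix(seq, period)
    total = len(seq)
    base_totals, pos_totals = Counter(), Counter()
    for (base, pos), count in matrix.items():
        base_totals[base] += count
        pos_totals[pos] += count
    return sum(
        (count / total)
        * math.log2(count * total / (base_totals[base] * pos_totals[pos]))
        for (base, pos), count in matrix.items()
    )

perfect = mutual_information("ACACACAC", 2)   # strict dinucleotide repeat
flat = mutual_information("AAAAAAAA", 2)      # no positional preference
```

A strict two-base repeat gives the maximum value for two equally frequent bases, while a homopolymer gives zero because position carries no information about the base.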
2.
3.
A method is proposed to represent and analyze complete genome sequences (52 species of prokaryotes and eukaryotes), based on the frequencies of amino acid pairs (bigrams) separated by a given number of other residues. For each species analyzed, it allows us to construct over-abundant and over-deficient occurrence profiles, summarizing amino acid bigram frequencies over the entire genome. The method deals efficiently with the sparseness of statistical representations of individual sequences, and describes every gene sequence in the same way, independently of its length and of the genome size. The frequency of over-abundant and over-deficient occurrences of bigrams shows a singular periodicity around 3.5 peptide bonds, suggesting a relation with the alpha-helical secondary structure.
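Counting bigrams separated by a fixed number of residues can be sketched as follows. This is a minimal illustration only: the function name `gapped_bigram_counts` is hypothetical, and the null model needed to call a pair over-abundant or over-deficient is not shown because the abstract does not specify it.

```python
from collections import Counter

def gapped_bigram_counts(protein, gap):
    """Count residue pairs (a, b) where exactly `gap` residues lie between a and b."""
    step = gap + 1  # distance between the two residues of the pair
    return Counter(
        (protein[i], protein[i + step]) for i in range(len(protein) - step)
    )

# Toy sequence, one intervening residue between the paired positions.
counts = gapped_bigram_counts("MKKLLKKM", gap=1)
```

Repeating the count for a range of gap values and comparing each pair against its expected frequency would give a profile as a function of separation, which is where a periodicity such as the one reported could be looked for.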
4.
We introduce a new concept, the triplet periodicity class (TPC), and a measure of similarity between such classes. We classified 472288 triplet periodicity (TP) regions found in 578868 genes from the 29th release of the KEGG databank. In total, 2520 classes were obtained, containing 94% of the 472288 TP cases found. For 92% of the TP regions contained in classes, the same linkage of TP to the open reading frame (ORF) is observed. For 8% of TP cases we found a shift between the ORF of a gene and the ORF common to the majority of genes contained in a TPC. For these 8% of periodic regions, hypothetical amino acid sequences corresponding to the ORF built by the TPC were constructed. BLAST showed that 2679 hypothetical amino acid sequences have statistically significant similarity with proteins from the UniProt databank. We suppose that the 8% of TP regions contained in classes possess a mutation originating from an ORF shift. The obtained TPCs can be used for identifying coding regions of genes as well as for searching for mutations arising from ORF shifts.
5.
Wei-min Liu 《Journal of mathematical biology》1993,31(5):487-494
This paper presents a theoretical analysis of an SEIRS epidemiological model. It shows that nonlinearity due to a dose-dependent latent period can cause periodicity through a Hopf bifurcation.
6.
Background
Alternative splicing (AS) contributes significantly to protein diversity by selectively using different combinations of exons of the same gene under certain circumstances. One particular type of AS is the use of alternative first exons (AFEs), which can have consequences far beyond the fine-tuning of protein functions. For example, AFEs may change the N-termini of proteins and thereby direct them to different cellular compartments. When alternative first exons are distant, they are usually associated with alternative promoters, thereby conferring an extra level of gene expression regulation. However, only a few studies have examined the patterns of AFEs, and these analyses were mainly focused on mammalian genomes. Recent studies have shown that AFEs exist in the rice genome and are regulated in a tissue-specific manner. Our current understanding of AFEs in plants is still limited, including important issues such as their regulation, contribution to protein diversity, and evolutionary conservation.
7.
Genomes of almost all organisms have been found to exhibit several periodicities, the most prominent being the three-base periodicity. It is more pronounced in the gene-coding regions and has been exploited to identify the segments of a genome that code for a protein. The three-base periodicity in the gene-coding regions has been attributed to inhomogeneous nucleotide compositions in the three codon positions. However, this cannot explain the three-base periodicity present at the level of the whole genome, where the codon concept is not applicable. Even though the distribution of each nucleotide is uniform at positions 0 (mod 3), 1 (mod 3) and 2 (mod 3) when whole-genome data are considered, our analysis reveals that the three-base periodicity arises because of higher correlations among nucleotides separated by three bases.
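One simple way to look for correlations among nucleotides separated by a given distance is to measure how often a base matches the base `lag` positions downstream. The sketch below is an illustration under that assumption, not the correlation measure used in the paper; `match_fraction` is a hypothetical name.

```python
def match_fraction(seq, lag):
    """Fraction of positions i where seq[i] equals seq[i + lag]."""
    n = len(seq) - lag
    if lag < 1 or n <= 0:
        raise ValueError("lag must be >= 1 and shorter than the sequence")
    return sum(seq[i] == seq[i + lag] for i in range(n)) / n

# Toy sequence with an exact period of three bases.
toy = "ACGACGACGACG"
profile = {lag: match_fraction(toy, lag) for lag in range(1, 7)}
```

On real genomic data the profile would be compared against the value expected from base composition alone; a three-base periodicity would appear as peaks at lags that are multiples of three.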
8.
Internal repeats in protein sequences have wide-ranging implications for the structure and function of proteins. A keen analysis of the repeats in protein sequences may help us to better understand the structural organization of proteins and their evolutionary relations. In this paper, a mathematical method for searching for latent periodicity in protein sequences is developed. Using this method, we identified simple sequence repeats in the alkaline proteases and found that the sequences could show the same periodicity as their tertiary structures. This result may help us to reduce difficulties in the study of the relationship between sequences and their structures.
9.
Original spectral-statistical methods were developed to recognize a new type of latent periodicity in DNA, called latent profile periodicity, or latent profility. Searching for latent profility allows the detection of different levels of information coding in genes and local DNA segments.
10.
The genomes of flowering plants vary in size from about 0.1 to over 100 gigabase pairs (Gbp), mostly because of polyploidy and variation in the abundance of repetitive elements in intergenic regions. High-quality sequences of the relatively small genomes of Arabidopsis (0.14 Gbp) and rice (0.4 Gbp) have now been largely completed. The sequencing of plant genomes that have a more representative size (the mean for flowering plant genomes is 5.6 Gbp) has been seen as a daunting task, partly because of their size and partly because of the numerous highly conserved repeats. Nevertheless, creative strategies and powerful new tools have been generated recently in the plant genetics community, so that sequencing large plant genomes is now a realistic possibility. Maize (2.4-2.7 Gbp) will be the first gigabase-size plant genome to be sequenced using these novel approaches. Pilot studies on maize indicate that the new gene-enrichment, gene-finishing and gene-orientation technologies are efficient, robust and comprehensive. These strategies will succeed in sequencing the gene-space of large genome plants, and in locating all of these genes and adjacent sequences on the genetic and physical maps.
11.
12.
The large number of ESTs generated for Arabidopsis and rice in recent years now acts as an important complement to whole genome sequencing projects. The Arabidopsis Genome Initiative has begun a coordinated effort to sequence the entire genome and, as a result, increasing numbers of large sequence entries can be found in the public databases. In addition, the mitochondrial genome of Arabidopsis has been completely sequenced. Genome sequencing studies and the public sequence databases have begun to influence the direction of diverse areas of research from physiology to evolution.
13.
Dinucleotide usage is known to vary in the genomes of organisms. The dinucleotide usage profiles, or genome signatures, are similar for sequence samples taken from the same genome, but differ between taxonomically distant species. This concept of genome signatures has been used to study several organisms, including viruses, to elucidate the signatures of evolutionary processes at the genome level. Genome signatures assume greater importance in the case of host–pathogen interactions, where molecular interactions between the two species take place continuously and can influence their genomic composition. In this study, whole genome sequences of HIV-1 subtype B, a retrovirus that caused the global AIDS pandemic, have been analysed to examine the variation in genome signatures of the virus from 1983 to 2007. We show statistically significant temporal variations in some dinucleotide patterns, highlighting the selective evolution of the dinucleotide profiles of HIV-1 subtype B, possibly a consequence of host-specific selection.
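A genome signature is commonly computed as the dinucleotide relative abundance, the observed frequency of each dinucleotide divided by the product of its two mononucleotide frequencies. The sketch below assumes that definition; the abstract does not state the exact formula used. It works on a single strand only, assumes the input contains only A, C, G and T, and `dinucleotide_signature` is a hypothetical name.

```python
from collections import Counter
from itertools import product

def dinucleotide_signature(seq):
    """Relative abundance rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides."""
    seq = seq.upper()
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono, n_di = len(seq), len(seq) - 1
    signature = {}
    for x, y in product("ACGT", repeat=2):
        fx, fy = mono[x] / n_mono, mono[y] / n_mono
        fxy = di[x + y] / n_di
        # A base that never occurs leaves the ratio undefined.
        signature[x + y] = fxy / (fx * fy) if fx and fy else float("nan")
    return signature

sig = dinucleotide_signature("ACGTACGT")
```

Values above 1 indicate a dinucleotide that is more frequent than its base composition predicts, and values below 1 one that is less frequent. Comparing signatures computed for isolates from different years is one way temporal variation could be examined.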
14.
Periodical organisms, such as bamboos and periodical cicadas, are well known for their synchronous reproduction. In bamboos and other periodical plants, synchronous mass-flowering and withering have often been reported, indicating that these species are monocarpic (semelparous). Synchronicity and periodicity are therefore often suspected to be tightly coupled traits in these periodical plants. We investigate the periodicity and synchronicity of Strobilanthes flexicaulis and a closely related species, S. tashiroi, on Okinawa Island, Japan. The genus Strobilanthes is known for several periodical species. Based on 32-year observational data, we confirmed that S. flexicaulis is a 6-year periodical mass-flowering monocarpic plant. All the flowering plants died after flowering. In contrast, three-year tracking of individuals showed that S. tashiroi is a polycarpic perennial with no mass-flowering. We also surveyed six local populations of S. flexicaulis and found variation in synchronicity, from four highly synchronized populations (>98% of plants flowering in the mass year) to two less synchronized ones with 11-47% of plants flowering before and after the mass year. This result might imply that synchrony is selected for once periodicity is established in monocarpic species. We found selective advantages for mass-flowering in pollinator activities and predator satiation. The current results suggest that the periodical S. flexicaulis might have evolved periodicity from a non-periodical close relative. The current report should become a key finding for understanding the evolution of periodical plants.
15.
All amino acid sequences derived from 248 prokaryotic genomes, 10 invertebrate genomes (plants and fungi) and 10 vertebrate genomes were analysed by the autocorrelation function of charge sequences. The analysis of the total amino acid sequences derived from the 268 biological genomes showed that a significant periodicity of 28 residues is observable for the vertebrate genomes, but not for the other genomes. When proteins with a charge periodicity of 28 residues (PCP28) were selected from the total proteomes, we found that PCP28 in fact exists in all proteomes, but the number of PCP28 is much larger for the vertebrate proteomes than for the other proteomes. Although excess PCP28 in the vertebrate proteomes are only poorly characterized, a detailed inspection of the databases suggests that most excess PCP28 are nuclear proteins.
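An autocorrelation of a charge sequence can be sketched as below. The charge assignment (K and R as +1, D and E as -1, everything else 0) and the plain, non-centred normalisation are assumptions made for illustration; the abstract does not give the authors' exact definitions, and the names `CHARGE`, `charge_sequence` and `autocorrelation` are hypothetical.

```python
# Assumed residue charges; histidine is treated as neutral in this sketch.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}

def charge_sequence(protein):
    """Map each residue to +1, -1 or 0."""
    return [CHARGE.get(residue, 0) for residue in protein.upper()]

def autocorrelation(charges, lag):
    """Mean product of charges separated by `lag` residues."""
    n = len(charges) - lag
    if lag < 1 or n <= 0:
        raise ValueError("lag must be >= 1 and shorter than the sequence")
    return sum(charges[i] * charges[i + lag] for i in range(n)) / n

# Toy protein: a positive and a negative residue alternate every 4 positions,
# so the charge pattern repeats every 8 residues.
toy = charge_sequence("KAAADAAA" * 4)
at_period = autocorrelation(toy, 8)
at_half_period = autocorrelation(toy, 4)
```

In this toy case the autocorrelation is positive at the period and negative at half the period. A 28-residue charge periodicity would show up the same way, as a positive peak at lag 28 when the function is averaged over many proteins.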
16.
17.
A web server for searching for latent periodicity, based on the method of modified profile analysis, has been developed. This method allows latent periodicity to be found in the presence of insertions and deletions. The search uses periodicity classes that we found earlier for various groups of organisms. The period length lies in the range 2-20 nt, not including the triplet periodicity. The results obtained are subjected to various filtration steps to ensure their statistical significance. Availability: The web server is free for non-commercial users. No registration is required. The URL of the server is http://victoria.biengi.ac.ru/lepscan. The current software version is 1.06.
18.
Turutina VP, Laskin AA, Kudryashov NA, Skryabin KG, Korotkov EV 《Biochemistry. Biokhimiia》2006,71(1):18-31
To detect the latent periodicity of protein families responsible for various biological functions, the methods of information decomposition, cyclic profile alignment, and noise decomposition have been used. Latent periodicity specific to a particular family is recognized in 94 of 110 analyzed protein families. Family-specific periodicity was found for more than 70% of amino acid sequences in each of these families. Based on such sequences, the characteristic profile of the latent periodicity has been deduced for each family. A possible relationship between the recognized latent periodicity, the evolution of proteins, and their structural organization is discussed.
19.
Vera P Turutina Andrew A Laskin Nikolay A Kudryashov Konstantin G Skryabin Eugene V Korotkov 《Journal of computational biology》2006,13(4):946-964
Here, we have applied information decomposition, cyclic profile alignment, and noise decomposition techniques to search for latent repeats within protein families of various functions. We have identified 94 protein families with a family-specific periodicity. In each case, the periodic element was found in more than 70% of family members. Latent periodicity profiles with specific length and signature were obtained in each case. The possible relationship between the periodic elements thus identified and the evolutionary development of the protein families is discussed, with specific reference to the possibility that there is a correlation between the periodic elements and protein function.
20.