Similar articles
 20 similar articles found (search time: 15 ms)
1.
2.
The information decomposition (ID) method has been used for searching dinucleotide periodicities, including latent ones, in plant genomes. In nucleotide sequences of genomes of various plants from the GenBank database, 14 766 sequences with a periodicity of two nucleotides have been found at a high level of statistical significance. Classification of the periodicity matrices of the detected DNA sequences has yielded 141 classes of dinucleotide periodicity. Since ID does not detect periodicities with nucleotide deletions or insertions, modified profile analysis (MPA) has been applied to the obtained classes to reveal DNA sequences with dinucleotide periodicities containing nucleotide deletions and insertions. Combined use of ID and MPA has permitted the detection of 80 396 DNA sequences with dinucleotide periodicities in the genomes of various plants. The biological role of dinucleotide periodicity in the detected sequences is discussed.

3.
The information decomposition (ID) method has been used for searching dinucleotide periodicities, including latent ones, in plant genomes. In nucleotide sequences of genomes of various plants from the GenBank database, 14 766 sequences with a periodicity of two nucleotides have been found. Classification of the periodicity matrices of the detected DNA sequences has yielded 141 classes of dinucleotide periodicity. Since ID does not detect periodicities with nucleotide deletions or insertions, modified profile analysis (MPA) has been applied to the obtained classes to reveal DNA sequences with dinucleotide periodicities containing nucleotide deletions and insertions. Combined use of ID and MPA has permitted the detection of 80 396 DNA sequences with dinucleotide periodicities in the genomes of various plants. The biological role of dinucleotide periodicity in the detected sequences is discussed.

4.
Shah K, Krishnamachari A. Bio Systems, 2012, 107(3): 142-144
Genomes of almost all organisms have been found to exhibit several periodicities, the most prominent of which is the three-base periodicity. It is more pronounced in the gene-coding regions and has been exploited to identify the segments of a genome that code for a protein. The three-base periodicity in the gene-coding regions has been attributed to inhomogeneous nucleotide compositions in the three codon positions. However, this cannot explain the three-base periodicity present at the level of the whole genome, where the codon concept is not applicable. Even though the distribution of each nucleotide is uniform at the positions 0 (mod 3), 1 (mod 3) and 2 (mod 3) when the whole-genome data are considered, our analysis reveals that the three-base periodicity arises because of higher correlations among nucleotides separated by three bases.
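
The idea of "higher correlations among nucleotides separated by three bases" can be illustrated with a minimal sketch. This is not the authors' method: the function name `lag_match_excess` and the choice of statistic (observed same-base frequency at a given lag, minus the frequency expected from base composition alone) are assumptions made purely for illustration.

```python
from collections import Counter


def lag_match_excess(seq: str, lag: int) -> float:
    """Observed minus expected frequency of identical bases `lag` positions apart.

    Hypothetical illustrative statistic, not taken from the cited paper.
    """
    seq = seq.upper()
    n = len(seq) - lag
    if lag <= 0 or n <= 0:
        raise ValueError("lag must be positive and shorter than the sequence")
    # Fraction of positions i where the base at i equals the base at i + lag.
    observed = sum(seq[i] == seq[i + lag] for i in range(n)) / n
    # Match probability expected if positions were independent draws
    # from the overall base composition.
    total = len(seq)
    expected = sum((count / total) ** 2 for count in Counter(seq).values())
    return observed - expected
```

On a real sequence, one would compare the value at lag 3 (and its multiples) with the values at neighbouring lags; a consistently higher value at lag 3 would be in line with the correlation described in the abstract.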

5.
We demonstrate the use of technology developed for optical mapping to acquire DNA fingerprints from single genomes for the purpose of discrimination and identification of bacteria and viruses. Single genome fingerprinting (SGF) provides not only the size but also the order of the restriction fragments, which adds another dimension to the information that can be used for discrimination. Analysis of single organisms may eliminate the need to culture cells and thereby significantly reduce analysis time. In addition, samples containing mixtures of several organisms can be analyzed. For analysis, cells are embedded in an agarose matrix, lysed, and processed to yield intact DNA. The DNA is then deposited on a derivatized glass substrate. The elongated genome is digested with a restriction enzyme and stained with the intercalating dye YOYO-1. DNA is then quantitatively imaged with a fluorescence microscope and the fragments are sized to an accuracy ≥90% by their fluorescence intensity and contour length. Single genome fingerprints were obtained from pure samples of adenovirus, from bacteriophages lambda and T4 GT7, and from a mixture of the three viral genomes. SGF will enable the fingerprinting of uncultured and unamplified samples and allow rapid identification of microorganisms with applications in forensics, medicine, public health, and environmental microbiology.

6.
Endogenous viral elements in animal genomes   Total citations: 2 (self-citations: 0; citations by others: 2)
Integration into the nuclear genome of germ line cells can lead to vertical inheritance of retroviral genes as host alleles. For other viruses, germ line integration has only rarely been documented. Nonetheless, we identified endogenous viral elements (EVEs) derived from ten non-retroviral families by systematic in silico screening of animal genomes, including the first endogenous representatives of double-stranded RNA, reverse-transcribing DNA, and segmented RNA viruses, and the first endogenous DNA viruses in mammalian genomes. Phylogenetic and genomic analysis of EVEs across multiple host species revealed novel information about the origin and evolution of diverse virus groups. Furthermore, several of the elements identified here encode intact open reading frames or are expressed as mRNA. For one element in the primate lineage, we provide statistically robust evidence for exaptation. Our findings establish that genetic material derived from all known viral genome types and replication strategies can enter the animal germ line, greatly broadening the scope of paleovirological studies and indicating a more significant evolutionary role for gene flow from virus to animal genomes than has previously been recognized.

7.
A method is proposed to represent and analyze complete genome sequences (52 species of prokaryotes and eukaryotes), based upon the n-gram frequencies of amino acid pairs (bigrams) separated by a given number of other residues. For each of the species analyzed, it allows us to construct over-abundant and over-deficient occurrence profiles, summarizing amino acid bigram frequencies over the entire genome. The method deals efficiently with the sparseness of statistical representations of individual sequences, and describes every gene sequence in the same way, independently of its length and of the genome size. The frequency of over-abundant and over-deficient occurrences of bigrams presents a singular periodicity around 3.5 peptide bonds, suggesting a relation with the alpha-helical secondary structure.
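
Counting amino acid pairs "separated by a given number of other residues" can be sketched as follows. This is a minimal illustration only: the function name `gapped_bigram_counts` and the convention that `gap` is the number of intervening residues are assumptions, and the abstract's over-abundance and over-deficiency profiles are not reproduced here.

```python
from collections import Counter


def gapped_bigram_counts(protein: str, gap: int) -> Counter:
    """Count ordered residue pairs (a, b) with exactly `gap` residues between them.

    Hypothetical helper for illustration; gap=0 gives adjacent pairs.
    """
    if gap < 0:
        raise ValueError("gap must be non-negative")
    step = gap + 1  # distance between the two residues of a pair
    return Counter(
        (protein[i], protein[i + step]) for i in range(len(protein) - step)
    )
```

For example, `gapped_bigram_counts("ABAB", 0)` counts the adjacent pairs `("A", "B")` twice and `("B", "A")` once, while `gap=1` pairs each residue with the one two positions ahead.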

8.
MOTIVATION: Viral genomes tend to code in overlapping reading frames to maximize informational content. This may result in atypical codon bias and particular evolutionary constraints. Due to the fast mutation rate of viruses, there is additional strong evidence for varying selection between intra- and intergenomic regions. The presence of multiple coding regions complicates the concept of the K(a)/K(s) ratio, and thus calls for an alternative approach when investigating selection strengths. Building on the paper by McCauley and Hein, we develop a method for annotating a viral genome coding in overlapping reading frames. We introduce an evolutionary model capable of accounting for varying levels of selection along the genome, and incorporate it into our prior single-sequence HMM methodology, extending it now to a phylogenetic HMM. Given an alignment of several homologous viruses to a reference sequence, we may thus achieve an annotation both of coding regions and of selection strengths, allowing us to investigate different selection patterns and hypotheses. RESULTS: We illustrate our method by applying it to a multiple alignment of four HIV2 sequences, as well as of three Hepatitis B sequences. We obtain an annotation of the coding regions, as well as a posterior probability for each site of the strength of selection acting on it. From this we may deduce the average posterior selection acting on the different genes. Whilst we are encouraged to see that, in HIV2, the genes gag and pol, which are known to be conserved, are indeed annotated as such, we also discover several sites of less stringent negative selection within the env gene. To the best of our knowledge, we are the first to subsequently provide a full selection annotation of the Hepatitis B genome by explicitly modelling the evolution within overlapping reading frames, and not relying on simple K(a)/K(s) ratios.

9.
10.
Nuclear import of viral DNA genomes   Total citations: 3 (self-citations: 0; citations by others: 3)

11.
A comparative genome analysis on exon-intron distribution profiles is performed for human and mouse genomes to deduce similarities and differences between them. Interestingly, both in human and mouse genomes, the total length in introns and intergenic DNA on each chromosome is significantly correlated to the chromosome size. The results presented provide a framework for understanding the nature and patterns of exon-intron length distributions, the constraints on them and their role in genome design and evolution.

12.
Equal Symbol Fourier Transforms (FTES), characterizing nucleotide periodicity, comprise components of 5-D vectors that define base-repeat properties of a genomic sequence. This report describes a conversion of the FTES signals to a common platform of Shannon information content to facilitate comparisons of periodic data with other measures of information for genes and genomes. The autocorrelation used to compute the discrete FTES formed the basis to define repeating bases in terms of conditional probabilities. We derived a vector equation to express the Shannon information content of a sequence in a way that preserves the distinct specificity of base repeat patterns characterized by FTES vectors. We suggest application of such information vectors to study the structure of information in genes, chromosomes, and genomes by χ² comparisons.

13.
All amino acid sequences derived from 248 prokaryotic genomes, 10 invertebrate genomes (plants and fungi) and 10 vertebrate genomes were analysed by the autocorrelation function of charge sequences. The analysis of the total amino acid sequences derived from the 268 biological genomes showed that a significant periodicity of 28 residues is observable for the vertebrate genomes, but not for the other genomes. When proteins with a charge periodicity of 28 residues (PCP28) were selected from the total proteomes, we found that PCP28 in fact exists in all proteomes, but the number of PCP28 is much larger for the vertebrate proteomes than for the other proteomes. Although excess PCP28 in the vertebrate proteomes are only poorly characterized, a detailed inspection of the databases suggests that most excess PCP28 are nuclear proteins.
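
An autocorrelation of a charge sequence can be sketched roughly as below. The charge assignment (K and R as +1, D and E as -1, everything else 0), the unnormalized mean-product form, and the name `charge_autocorrelation` are all assumptions for illustration; the abstract does not state the exact definition the authors used.

```python
# Assumed residue charges; histidine and terminal charges are ignored here.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def charge_autocorrelation(protein: str, lag: int) -> float:
    """Mean product of residue charges `lag` positions apart.

    Illustrative only; not the cited study's exact formula.
    """
    charges = [CHARGE.get(residue, 0) for residue in protein.upper()]
    n = len(charges) - lag
    if lag <= 0 or n <= 0:
        raise ValueError("lag must be positive and shorter than the sequence")
    return sum(charges[i] * charges[i + lag] for i in range(n)) / n
```

Under these assumptions, a protein with a charge periodicity of 28 residues would show a larger value at lag 28 than at nearby lags.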

14.
Comparative analysis of processed pseudogenes in the mouse and human genomes   Total citations: 16 (self-citations: 0; citations by others: 16)
Pseudogenes are important resources in evolutionary and comparative genomics because they provide molecular records of the ancient genes that existed in the genome millions of years ago. We have systematically identified approximately 5000 processed pseudogenes in the mouse genome, and estimated that approximately 60% are lineage specific, created after the mouse and human diverged. In both mouse and human genomes, similar types of genes give rise to many processed pseudogenes. These tend to be housekeeping genes, which are highly expressed in the germ line. Ribosomal-protein genes, in particular, form the largest sub-group. The processed pseudogenes in the mouse occur with a distinctly different chromosomal distribution than LINEs or SINEs - preferentially in GC-poor regions. Finally, the age distribution of mouse-processed pseudogenes closely resembles that of LINEs, in contrast to human, where the age distribution closely follows Alus (SINEs).

15.
Improving gene annotation of complete viral genomes   Total citations: 4 (self-citations: 0; citations by others: 4)
Gene annotation in viruses often relies upon similarity search methods. These methods possess high specificity but some genes may be missed, either those unique to a particular genome or those highly divergent from known homologs. To identify potentially missing viral genes we have analyzed all complete viral genomes currently available in GenBank with a specialized and augmented version of the gene finding program GeneMarkS. In particular, by implementing genome-specific self-training protocols we have better adjusted the GeneMarkS statistical models to sequences of viral genomes. Hundreds of new genes were identified, some in well-studied viral genomes. For example, a new gene predicted in the genome of the Epstein–Barr virus was shown to encode a protein similar to α-herpesvirus minor tegument protein UL14 with heat shock functions. Convincing evidence of this similarity was obtained after only 12 PSI-BLAST iterations. In another example, several iterations of PSI-BLAST were required to demonstrate that a gene predicted in the genome of Alcelaphine herpesvirus 1 encodes a BALF1-like protein which is thought to be involved in apoptosis regulation and, potentially, carcinogenesis. New predictions were used to refine annotations of viral genomes in the RefSeq collection curated by the National Center for Biotechnology Information. Importantly, even in those cases where no sequence similarities were detected, GeneMarkS significantly reduced the number of primary targets for experimental characterization by identifying the most probable candidate genes. The new genome annotations were stored in VIOLIN, an interactive database which provides access to similarity search tools for up-to-date analysis of predicted viral proteins.

16.
Site-specific or target-specific mutagenesis of viral DNA genomes, using a selectable marker system, is a powerful tool for analyzing the function of specific regions of large DNA genomes. These techniques make it possible to construct vectors capable of delivering vaccines for the prevention of infectious disease in humans and animals.

17.
EB viral genomes in epithelial nasopharyngeal carcinoma cells   Total citations: 26 (self-citations: 0; citations by others: 26)

18.
Polydnavirus genomes and viral gene functions are atypical for viruses. Polydnaviruses are the only group of viruses with segmented DNA genomes and have an unusual obligate mutualistic association with parasitic Hymenoptera, in which the virus is required for survival of the wasp host and vice versa. The virus replicates asymptomatically in the wasp host but severely disrupts lepidopteran host physiology in the absence of viral DNA replication. It is not surprising then that viral gene expression is divergent in its two insect hosts and that differences in viral gene expression are linked to these divergent functions. Some viral genes are expressed only in the wasp host while other viral genes are expressed only in the lepidopteran host and are presumed to be involved in the disruption of host physiological systems. Our laboratory has described the expression and regulation of a family of viral genes implicated in suppressing the lepidopteran immune system, the cys-motif genes. In conjunction with these studies we have described the physical organization of additional viral gene segments. We have cloned, mapped and begun the sequence analysis of selected viral DNA segments. We have noted that some viral DNA segments are nested and that nested viral DNA segments encode the abundantly expressed, secreted cys-motif genes. Conversely, other viral segments are not nested, encode less abundantly expressed genes and may be targeted intra-cellularly. These results suggest that nesting of segments in polydnavirus genomes may be linked to the levels of gene expression. By extension, the unique, segmented organization of polydnavirus genomes may be associated, in part, with the requirement for divergent levels of viral gene expression in lepidopteran hosts in the absence of viral DNA replication.

19.
The majority of human, animal and plant viral pathogens possess genomes composed of RNA. The strategies evolved for expression and replication of viral RNA genomes can differ significantly from those utilized for expression and replication of host-cell genetic material. Consequently, knowledge of the molecular details of these strategies can lead to a clearer understanding of the origin, evolution and control of viral pathogens. We describe recent progress in identifying important structural and functional domains of the RNA genomes and associated replicative enzymes for two very different viruses: vesicular stomatitis virus, which possesses a single-stranded RNA genome of negative polarity, and wound tumor virus, which contains a genome composed of 12 discrete segments of double-stranded RNA.

20.
Viral quasispecies may contain a subset of minority genomes that reflect those genomic sequences that were dominant at an early phase of quasispecies evolution. Such minority genomes are referred to as memory in viral quasispecies. A memory marker previously characterized in foot-and-mouth disease virus (FMDV) is an internal oligoadenylate tract of variable length that became dominant upon serial plaque-to-plaque transfers of FMDV clones. During large population passages, genomes with internal oligoadenylate were outcompeted by wild-type revertants but remained in the mutant spectra as memory genomes. Here, we report a quantification of the relative fitness of several FMDV clones that harbor internal oligoadenylate tracts of different lengths and were retrieved at early or late times (passage number) after implementation of memory. The results show that for any given length range of the oligoadenylate, maintenance in memory resulted in an increase in relative fitness, comparable to the increase undergone by the entire population. The fitness increase is in agreement with the Red Queen hypothesis, and implies a replicative memory mechanism. Thus, permanence of memory genomes may be a source of high fitness variants despite their initial low fitness, and despite having remained hidden in mutant spectra. This reinforces the interest of diagnosing minority genomes during chronic human and animal viral infections.
