Similar literature
 20 similar documents found.
1.
The recent availability of several archaeal genome sequences has provided a basis for detailed analyses of the frequency, location and phylogeny of archaeal mobile elements. All the known elements fall into two main types, autonomous insertion sequence (IS) elements and the non-autonomous miniature inverted repeat element (MITE)-like elements. Both classes are considered to be mobilized via transposases that are encoded by the IS elements, although mobility has only been demonstrated experimentally for a few elements. The number and diversity of the elements differ greatly between the genomes. At one extreme, Sulfolobus solfataricus P2 and Halobacterium NRC-1 are very rich in elements, while Methanobacterium thermoautotrophicum contains none. The former also show examples of complex clusters of interwoven elements. An analysis of the genomic distribution in S. solfataricus suggests that the putative oriC and terC regions act as barriers for the mobility of both IS and MITE-like elements. Moreover, the very high level of truncated IS elements in the genomes of S. solfataricus, Sulfolobus tokodaii and Thermoplasma volcanium suggests that there may be a cellular mechanism for selectively inactivating IS elements at a point when they become too numerous and disadvantageous for the cell. Phylogenetically, archaeal IS elements are confined to 11 of the 17 known families of bacterial and eukaryal IS elements, where some generate distinct subgroups. Finally, DNA viruses, plasmids and DNA fragments can also be inserted into, and excised from, archaeal genomes by means of an integrase-mediated mechanism that has special archaeal characteristics.

2.
A new statistical method associating each trinucleotide with a frame is developed for identifying circular codes. Its sensitivity allows the detection of several circular codes in the (protein coding) genes of archaeal genomes. Several properties of these circular codes are described, in particular the lengths of the minimal windows to retrieve the construction frames, a new definition of a parameter for measuring some probabilities of words generated by the circular codes, and the types of nucleotides in the trinucleotide sites. Some biological consequences are presented in the Discussion.

3.
Comparisons among the complete genomes of four betanodavirus genotypes   (Cited by: 1; self-citations: 0; citations by others: 1)
Betanodaviruses, the causative agents of viral nervous necrosis in marine fish, have bipartite positive-sense RNA genomes and have been classified (based on analysis of RNA2 sequences) into 4 genotypes: tiger puffer nervous necrosis virus (TPNNV), barfin flounder nervous necrosis virus (BFNNV), striped jack nervous necrosis virus (SJNNV), and redspotted grouper nervous necrosis virus (RGNNV). Full-length genomes of TPNNV and BFNNV were sequenced for the first time in this study. Their sequence data and those of SJNNV and RGNNV retrieved from GenBank were compared in order to investigate the relationships among the 4 genotypes. Between TPNNV and BFNNV, sequence identities were relatively high in RNA1 and encoded Protein A, but were not significantly high in RNA2 or the coat protein (CP). Similarly, between BFNNV and RGNNV, the amino acid sequences of CP were highly similar, but identities of RNA1, RNA2, and Protein A sequences were not especially high. Furthermore, multiple alignment data of the 4 genotypes of RNA2 sequences revealed that the TPNNV and SJNNV sequences have the same sizes of gaps and extra sequences at the same positions. Collectively, these apparent contradictions in sequence identity suggest that betanodavirus genomes have been constructed via complex evolutionary processes.

4.
Complete archaeal genomes were probed for the presence of long (≥ 25 bp) oligonucleotide repeats (words). We detected the presence of many words distributed in tandem with narrow ranges of periodicity (i.e., spacer length between repeats). Similar words were not identified in genomes of non-archaeal species, namely Escherichia coli, Bacillus subtilis, Haemophilus influenzae, Mycoplasma genitalium and Mycoplasma pneumoniae. BLAST similarity searches against the GenBank nucleotide sequence database revealed that these words were archaeal species-specific, indicating that they are of a signature character. Sequence analysis and genome viewing tools showed these repeats to be restricted to non-coding regions. Thus, archaea appear to possess a non-coding genomic signature that is absent in bacterial species. The identification of a species-specific genomic signature would be of great value to archaeal genome mapping, evolutionary studies and analyses of genome complexity.

5.
Raghavan S, Hariharan R, Brahmachari SK. Gene. 2000;242(1-2):275-283
The genomes of Methanococcus jannaschii, Mycoplasma genitalium, Haemophilus influenzae, Archaeoglobus fulgidus, Helicobacter pylori, Treponema pallidum, Borrelia burgdorferi, Rickettsia prowazekii, Mycobacterium tuberculosis, Methanobacterium thermoautotrophicum, Synechocystis sp. PCC6803, Bacillus subtilis, Chlamydia trachomatis, Pyrococcus horikoshii, Aquifex aeolicus, Mycoplasma pneumoniae and Escherichia coli have been analysed for the presence of polypurine.polypyrimidine tracts, in order to understand their distribution in these genomes. We observed a variation in abundance of such sequences in these bacteria, with the archaeal genomes forming a high-abundance group and the canonical eubacteria forming a low-abundance group. The genomes of M. tuberculosis and A. aeolicus are unique among the organisms analysed here in the abnormal underrepresentation and overrepresentation of polypurine.polypyrimidine, respectively. We also observe a strand bias, i.e., a preferential occurrence of polypurines in coding strands. It varies widely among the bacteria, from the very high bias in M. jannaschii to the slightly inverse bias in the parasitic genomes of T. pallidum and C. trachomatis. The extent of strand bias, however, cannot be explained on the basis of the GC-content of the genome, use of all-purine codons or an excess in the amino acids that are encoded by such codons. The probable causes and effects of this phenomenon are discussed.
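
To make the two quantities in this abstract concrete, here is a minimal Python sketch of how polypurine and polypyrimidine tracts, and a simple strand-bias measure, might be computed. The minimum tract length (`min_len`), the function names, and the definition of bias as the purine share of tract bases are illustrative assumptions; the abstract does not state the thresholds or the statistic the authors used.

```python
import re

def find_tracts(seq, min_len=10):
    """Return (start, end, kind) for each run of >= min_len purines or pyrimidines.

    min_len is an assumed threshold, not a value taken from the abstract.
    """
    seq = seq.upper()
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    tracts = []
    for m in pattern.finditer(seq):
        kind = "purine" if m.group()[0] in "AG" else "pyrimidine"
        tracts.append((m.start(), m.end(), kind))
    return tracts

def strand_bias(coding_strands, min_len=10):
    """Fraction of tract bases on the coding strands that lie in purine tracts.

    A value above 0.5 means polypurines are preferred on the coding strand.
    Returns None when no tract is found. This ratio is a hypothetical measure.
    """
    purine_bases = pyrimidine_bases = 0
    for strand in coding_strands:
        for start, end, kind in find_tracts(strand, min_len):
            if kind == "purine":
                purine_bases += end - start
            else:
                pyrimidine_bases += end - start
    total = purine_bases + pyrimidine_bases
    return None if total == 0 else purine_bases / total
```

Passing the coding-strand sequence of every annotated gene to `strand_bias` gives one number per genome, which could then be compared against GC content as the abstract describes.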

6.
7.
8.
Shannon information in the genomes of all completely sequenced prokaryotes and eukaryotes is measured in word lengths of two to ten letters. It is found that, in a scale-dependent way, the Shannon information in complete genomes is much greater than that in matching random sequences: thousands of times greater in the case of short words. Furthermore, with the exception of the 14 chromosomes of Plasmodium falciparum, the Shannon information in all available complete genomes belongs to a universality class given by an extremely simple formula. The data are consistent with a model for genome growth composed of two main ingredients: random segmental duplications that increase the Shannon information in a scale-independent way, and random point mutations that preferentially reduce the larger-scale Shannon information. The inference drawn from the present study is that the large-scale and coarse-grained growth of genomes was selectively neutral, and this suggests an independent corroboration of Kimura's neutral theory of evolution.
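
As a rough illustration of measuring information at a given word length, the Python sketch below counts overlapping k-letter words and reports how far their entropy falls below the maximum of 2k bits for a four-letter alphabet. This "entropy deficit" is one common convention and is an assumption here; the abstract does not give the authors' exact definition or their universal formula, and the function names are hypothetical.

```python
import math
from collections import Counter

def kmer_counts(seq, k):
    """Count overlapping words of length k, skipping words with non-ACGT letters."""
    seq = seq.upper()
    alphabet = set("ACGT")
    return Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= alphabet
    )

def shannon_entropy(counts):
    """Shannon entropy (in bits) of the empirical word-frequency distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_deficit(seq, k):
    """Assumed definition: maximum entropy (2k bits) minus observed entropy."""
    return 2 * k - shannon_entropy(kmer_counts(seq, k))
```

Comparing `information_deficit(genome, k)` with the same quantity for a shuffled copy of the genome, for k from 2 to 10, would mirror the genome-versus-random comparison the abstract reports.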

9.
Novel protein families in archaean genomes.   (Cited by: 3; self-citations: 2; citations by others: 1)

10.
Saccharolobus (formerly Sulfolobus) shibatae B12, isolated from a hot spring in Beppu, Japan in 1982, is one of the first hyperthermophilic and acidophilic archaeal species to be discovered. It serves as a natural host to the extensively studied spindle-shaped virus SSV1, a prototype of the Fuselloviridae family. Two additional Sa. shibatae strains, BEU9 and S38A, sensitive to viruses of the families Lipothrixviridae and Portogloboviridae, respectively, have been isolated more recently. However, none of the strains has been fully sequenced, limiting their utility for studies on archaeal biology and virus–host interactions. Here, we present the complete genome sequences of all three Sa. shibatae strains and explore the rich diversity of their integrated mobile genetic elements (MGEs), including transposable insertion sequences, integrative and conjugative elements, plasmids, and viruses, some of which were also detected in the extrachromosomal form. Analysis of related MGEs in other Sulfolobales species and patterns of CRISPR spacer targeting revealed a complex network of MGE distributions, involving horizontal spread and relatively frequent host switching by MGEs over large phylogenetic distances, across species of the genera Saccharolobus, Sulfurisphaera and Acidianus. Furthermore, we characterize a remarkable case of a virus-to-plasmid transition, whereby a fusellovirus has lost the genes encoding the capsid proteins, while retaining the replication module, effectively becoming a plasmid.

11.
12.
Gene recognition from questionable ORFs in bacterial and archaeal genomes   (Cited by: 1; self-citations: 0; citations by others: 1)
The ORFs of microbial genomes in annotation files are usually classified into two groups: the first corresponds to known genes, whereas the second includes 'putative', 'probable', 'conserved hypothetical', 'hypothetical', 'unknown' and 'predicted' ORFs etc. Since the annotation is not 100% accurate, it is essential to confirm which ORFs of the latter group are coding and which are not. Starting from known genes in the former, this paper describes an improved Z curve method to recognize genes in the latter. Ten-fold cross-validation tests show that the average accuracy of the algorithm is greater than 99% for recognizing the known genes in 57 bacterial and archaeal genomes. The method is then applied to recognize genes of the latter group. The likely non-coding ORFs in each of the 57 bacterial or archaeal genomes studied here are recognized and listed at the website http://tubic.tju.edu.cn/ZCURVE_C_html/noncoding.html. The working mechanism of the algorithm is discussed in detail. A computer program, called ZCURVE_C, was written to calculate a coding score, called the Z-curve score, for ORFs in the above 57 bacterial and archaeal genomes. An ORF is classified as coding if its Z-curve score is > 0 and as non-coding if it is < 0. A website has been set up to provide the service to calculate the Z-curve score. A user may submit the DNA sequence of an ORF to the server at http://tubic.tju.edu.cn/ZCURVE_C/Default.cgi, and the Z-curve score of the ORF is calculated and returned to the user immediately.
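
For readers unfamiliar with the Z curve, the Python sketch below shows the standard Z-curve transform of a DNA sequence and the sign-based decision rule quoted in the abstract. It does not reproduce ZCURVE_C: how the program turns Z-curve features into a score is not given here, so `classify_orf` simply takes a precomputed score as input. Both function names are hypothetical.

```python
# Per-base steps of the standard Z-curve transform:
#   x: purine (A, G) vs pyrimidine (C, T)
#   y: amino (A, C) vs keto (G, T)
#   z: weak H-bond (A, T) vs strong H-bond (C, G)
_STEP = {
    "A": (1, 1, 1),
    "G": (1, -1, -1),
    "C": (-1, 1, -1),
    "T": (-1, -1, 1),
}

def z_curve(seq):
    """Return the cumulative (x, y, z) coordinates after each valid base."""
    x = y = z = 0
    points = []
    for base in seq.upper():
        if base not in _STEP:
            continue  # skip ambiguous letters such as N
        dx, dy, dz = _STEP[base]
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points

def classify_orf(z_curve_score):
    """Apply the abstract's rule: score > 0 is coding, score < 0 is non-coding.

    A score of exactly 0 is not covered by the abstract, so it is reported
    as 'undetermined' here (an assumption).
    """
    if z_curve_score > 0:
        return "coding"
    if z_curve_score < 0:
        return "non-coding"
    return "undetermined"
```

For example, `z_curve("ACGT")` ends at `(0, 0, 0)` because the four bases occur once each, so every one of the three contrasts cancels.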

13.
Despite extensive interest in the systematics of Pinnipedia, questions remain concerning phylogenetic relationships within the Phocidae or "true" seals. Relationships within the phocids and their placement relative to the remaining pinnipeds and major lineages of arctoid carnivores were examined using a large molecular data set consisting of 12 mitochondrial protein coding genes. Phylogenetic analysis including 15 extant species of the Phocidae, and representatives of the Otariidae, Odobenidae, Ursidae, Mustelidae, Canidae, and Felidae confirmed the monophyletic origins of the Pinnipedia within the Arctoidea. Slightly more support was found for an ursid affinity of the pinnipeds; however, this relationship remains contentious. The Phocidae were placed as the sister group to a common odobenid-otariid clade. Within the family Phocidae, strong support for the traditionally accepted subfamilies Phocinae (northern seals), and Monachinae (southern seals plus monk seals) was found. In contrast to recent suggestions, a monophyletic Monachus was strongly supported and was placed in a deep branching position within the Monachinae. Evidence from sequence divergence under a maximum likelihood model illustrated that the rarely used tribal distinctions within the Monachinae are comparable, in terms of evolutionary distance, to accepted tribal distinctions within the Phocinae. In addition, results suggest that Pagophilus should be accepted as a genus within the Phocini. Sequence divergence between Phoca, Pusa, and Halichoerus is minimal, supporting a taxonomic reclassification of the three genera into an emended genus Phoca, without subgeneric distinctions.

14.
Alongside the well-studied membrane spanning helices, alpha-helical transmembrane (TM) proteins contain several functionally and structurally important types of substructures. Here, existing 3D structures of transmembrane proteins have been used to define and study the concept of reentrant regions, i.e. membrane penetrating regions that enter and exit the membrane on the same side. We find that these regions can be divided into three distinct categories based on secondary structure motifs, namely long regions with a helix-coil-helix motif, regions of medium length with the structure helix-coil or coil-helix and regions of short to medium length consisting entirely of irregular secondary structure. The residues situated in reentrant regions are significantly smaller on average compared to other regions and reentrant regions can be detected in the inter-transmembrane loops with an accuracy of approximately 70% based on their amino acid composition. Using TOP-MOD, a novel method for predicting reentrant regions, we have scanned the genomes of Escherichia coli, Saccharomyces cerevisiae and Homo sapiens. The results suggest that more than 10% of transmembrane proteins contain reentrant regions and that the occurrence of reentrant regions increases linearly with the number of transmembrane regions. Reentrant regions seem to be most commonly found in channel proteins and least commonly in signal receptors.

15.

Background  

Detecting new coding sequences (CDSs) in viral genomes can be difficult for several reasons. The typically compact genomes often contain a number of overlapping coding and non-coding functional elements, which can result in unusual patterns of codon usage; conservation between related sequences can be difficult to interpret – especially within overlapping genes; and viruses often employ non-canonical translational mechanisms – e.g. frameshifting, stop codon read-through, leaky-scanning and internal ribosome entry sites – which can conceal potentially coding open reading frames (ORFs).

16.

Background

With the advances in DNA sequencer-based technologies, it has become possible to automate several steps of the genotyping process leading to increased throughput. To efficiently handle the large amounts of genotypic data generated and help with quality control, there is a strong need for a software system that can help with the tracking of samples and capture and management of data at different steps of the process. Such systems, while serving to manage the workflow precisely, also encourage good laboratory practice by standardizing protocols, recording and annotating data from every step of the workflow.

Results

A laboratory information management system (LIMS) has been designed and implemented at the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) that meets the requirements of a moderately high throughput molecular genotyping facility. The application is designed as modules and is simple to learn and use. The application leads the user through each step of the process, from starting an experiment to storing the output data from the genotype detection step with auto-binning of alleles, thus ensuring that every DNA sample is handled in an identical manner and all the necessary data are captured. The application keeps track of DNA samples and generated data. Data entry into the system is through the use of forms for file uploads. The LIMS provides functions to trace back to the electrophoresis gel files or sample source for any genotypic data and for repeating experiments. The LIMS is presently being used for the capture of high throughput SSR (simple-sequence repeat) genotyping data from the legume (chickpea, groundnut and pigeonpea) and cereal (sorghum and millets) crops of importance in the semi-arid tropics.

Conclusion

A laboratory information management system is available that has been found useful in the management of microsatellite genotype data in a moderately high throughput genotyping laboratory. The application with source code is freely available for academic users and can be downloaded from http://www.icrisat.org/gt-bt/lims/lims.asp.

17.
Vertebrate genomes are mosaics of isochores. On the assumption that marked differences exist in the isochore structure between warm-blooded and cold-blooded animals, variations among vertebrates were previously attributed to adaptation to homeothermy. However, based on the data of coding regions from representatives of extant vertebrates, including a turtle, a crocodile (Archosauromorpha) and a few kinds of snakes (Lepidosauromorpha), it was recently hypothesized that the common ancestors of mammals, birds and extant reptiles already had the "warm-blooded" isochore structure. To test this hypothesis, the nucleotide sequences of alpha-globin genes including non-coding regions (introns) from two snakes, N. kaouthia and E. climacophora, were determined (accession numbers: AB104824, AB104825). The correlation between the GC contents in the introns and exons of alpha-globin genes from snakes and those from other vertebrates supports the above hypothesis. A similar analysis using data for exons and introns of other genes obtained from GenBank (Release 131) also supports the above hypothesis.

18.
19.
Bread wheat (Triticum aestivum) is an allohexaploid species, consisting of three subgenomes (A, B, and D). To study the molecular evolution of these closely related genomes, we compared the sequence of a 307-kb physical contig covering the high molecular weight (HMW)-glutenin locus from the A genome of durum wheat (Triticum turgidum, AABB) with the orthologous regions from the B genome of the same wheat and the D genome of the diploid wheat Aegilops tauschii (Anderson et al., 2003; Kong et al., 2004). Although gene colinearity appears to be retained, four out of six genes including the two paralogous HMW-glutenin genes are disrupted in the orthologous region of the A genome. Mechanisms involved in gene disruption in the A genome include retroelement insertions, sequence deletions, and mutations causing in-frame stop codons in the coding sequences. Comparative sequence analysis also revealed that sequences in the colinear intergenic regions of these different genomes were generally not conserved. The rapid genome evolution in these regions is attributable mainly to the large number of retrotransposon insertions that occurred after the divergence of the three wheat genomes. Our comparative studies indicate that the B genome diverged prior to the separation of the A and D genomes. Furthermore, sequence comparison of two distinct types of allelic variations at the HMW-glutenin loci in the A genomes of different hexaploid wheat cultivars with the A genome locus of durum wheat indicates that hexaploid wheat may have more than one tetraploid ancestor.

20.
Chen M, Zeng G, Tan Z, Jiang M, Zhang J, Zhang C, Lu L, Lin Y, Peng J. FEBS Letters. 2011;585(7):1072-1076
Compound microsatellites consisting of two or more repeats in close proximity have been found in eukaryotic genomes. So far, such compound microsatellites have not been investigated in any prokaryotic genomes. We have therefore examined compound microsatellites in 22 complete genomes of Escherichia coli, which is one of the ideal model organisms to analyze the nature and evolution of prokaryotic compound microsatellites. Our results indicated that about 1.75-2.85% of all microsatellites could be classified as compound microsatellites with very low complexity, and most compound microsatellites were composed of very different motifs. Compound microsatellites were significantly overrepresented in all surveyed genomes. These results were dramatically different from those in eukaryotes. We discussed the possible reasons for the observed divergence.
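
The notion of "two or more repeats in close proximity" can be sketched in Python as a two-step scan: find individual microsatellites, then group neighbours separated by at most `dmax` bases. The thresholds (`min_repeats`, `max_motif`, `dmax`) and the function names are illustrative assumptions; the abstract does not state the detection criteria the authors applied.

```python
import re

def find_microsatellites(seq, min_repeats=3, max_motif=6):
    """Return (start, end, motif) for tandem repeats of a 1..max_motif bp motif.

    The shortest repeating motif is preferred (lazy quantifier), and a repeat
    must contain at least min_repeats copies. Both limits are assumed values.
    """
    seq = seq.upper()
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq)]

def find_compound_microsatellites(seq, dmax=10, min_repeats=3, max_motif=6):
    """Group consecutive microsatellites whose gap is <= dmax bases.

    Only groups with two or more members are returned as compound
    microsatellites. dmax is an assumed proximity threshold.
    """
    compounds, current = [], []
    for ssr in find_microsatellites(seq, min_repeats, max_motif):
        if current and ssr[0] - current[-1][1] <= dmax:
            current.append(ssr)
        else:
            if len(current) > 1:
                compounds.append(current)
            current = [ssr]
    if len(current) > 1:
        compounds.append(current)
    return compounds
```

The share of compound microsatellites in a genome could then be estimated as the number of microsatellites that fall inside a group divided by the total number found, which is the kind of percentage the abstract reports.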
