Similar articles
Found 20 similar articles (search time: 62 ms)
1.
In recent years, the growing amount of available genomic data has made it easier to appreciate the extent to which organisms increase their genetic diversity through horizontally transferred genetic material. Such transfers have the potential to give rise to extremely dynamic genomes in which a significant proportion of the coding DNA has been contributed by external sources. Because of the impact of these horizontal transfers on the ecological and pathogenic character of the recipient organisms, methods are continuously sought that can computationally determine which of the genes of a given genome are products of transfer events. In this paper, we introduce and discuss a novel computational method for identifying horizontal transfers that relies on a gene's nucleotide composition and obviates the need for knowledge of codon boundaries. In addition to being applicable to individual genes, the method can be easily extended to the case of clusters of horizontally transferred genes. With the help of an extensive and carefully designed set of experiments on 123 archaeal and bacterial genomes, we demonstrate that the new method exhibits significant improvement in sensitivity when compared to previously published approaches. In fact, it achieves an average relative improvement across genomes of between 11 and 41% compared to the Codon Adaptation Index method in distinguishing native from foreign genes. Our method's horizontal gene transfer predictions for 123 microbial genomes are available online at http://cbcsrv.watson.ibm.com/HGT/.
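
The abstract does not spell out the scoring scheme, so the following is only a minimal sketch of the general idea of frame-independent, composition-based screening, not the paper's method. It compares a gene's overlapping k-mer profile with that of the whole genome. The function names, the choice of k = 3, and the Manhattan distance are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq: str, k: int = 3) -> dict[str, float]:
    """Relative frequencies of all overlapping k-mers (no codon frame needed)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers built from unambiguous bases are counted.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


def atypicality(gene: str, genome: str, k: int = 3) -> float:
    """Manhattan distance between gene and genome k-mer profiles (0 = identical)."""
    g = kmer_profile(gene, k)
    ref = kmer_profile(genome, k)
    return sum(abs(g[m] - ref[m]) for m in g)


if __name__ == "__main__":
    genome = "ATGCGC" * 200      # toy "host" sequence (hypothetical data)
    native = genome[30:330]      # same composition as the host
    foreign = "ATATTA" * 50      # AT-rich toy "foreign" gene
    print(atypicality(native, genome) < atypicality(foreign, genome))
```

In practice a threshold on such a score would have to be calibrated per genome; the toy sequences above only show that a compositionally atypical gene scores higher than a native one.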

2.
Comparative genomics has revealed that variations in bacterial and archaeal genome DNA sequences cannot be explained by only neutral mutations. Virus resistance and plasmid distribution systems have resulted in changes in bacterial and archaeal genome sequences during evolution. The restriction-modification system, a virus resistance system, leads to avoidance of palindromic DNA sequences in genomes. Clustered, regularly interspaced, short palindromic repeats (CRISPRs) found in genomes represent yet another virus resistance system. Comparative genomics has shown that bacteria and archaea have failed to gain any DNA with GC content higher than the GC content of their chromosomes. Thus, horizontally transferred DNA regions have lower GC content than the host chromosomal DNA does. Some nucleoid-associated proteins bind DNA regions with low GC content and inhibit the expression of genes contained in those regions. This form of gene repression is another type of virus resistance system. On the other hand, bacteria and archaea have used plasmids to gain additional genes. Virus resistance systems influence plasmid distribution. Interestingly, the restriction-modification system and nucleoid-associated protein genes have been distributed via plasmids. Thus, GC content and genomic signatures do not reflect bacterial and archaeal evolutionary relationships.

3.
Acquisition of new genetic material through horizontal gene transfer has been shown to be an important feature in the evolution of many pathogenic bacteria. Changes in the genetic repertoire, occurring through gene acquisition and deletion, are the major events underlying the emergence and evolution of bacterial pathogens. However, horizontal gene transfer across domains, i.e., between archaea and bacteria, is less common. In this context, we explore events of horizontal gene transfer between archaea and bacteria. In order to determine whether the acquisition of archaeal genes by lateral gene transfer is an important feature in the evolutionary history of the pathogenic bacteria, we have developed a scheme of stepwise eliminations that identifies archaeal-like genes in various bacterial genomes. We report the presence of 9 genes of archaeal origin in the genomes of various bacteria, a subset of which is unique to the pathogenic members and not found in their respective non-pathogenic counterparts. We believe that these genes, having been retained in the respective genomes through selective advantage, have key functions in the organism’s biology and may play a role in pathogenesis.

4.
Horizontal gene transfer (HGT) is central to prokaryotic evolution. However, little is known about the “scale” of individual HGT events. In this work, we introduce the first computational framework to help answer the following fundamental question: How often does more than one gene get horizontally transferred in a single HGT event? Our method, called HoMer, uses phylogenetic reconciliation to infer single-gene HGT events across a given set of species/strains, employs several techniques to account for inference error and uncertainty, combines that information with gene order information from extant genomes, and uses statistical analysis to identify candidate horizontal multigene transfers (HMGTs) in both extant and ancestral species/strains. HoMer is highly scalable and can be easily used to infer HMGTs across hundreds of genomes. We apply HoMer to a genome-scale data set of over 22,000 gene families from 103 Aeromonas genomes and identify a large number of plausible HMGTs of various scales at both small and large phylogenetic distances. Analysis of these HMGTs reveals interesting relationships between gene function, phylogenetic distance, and frequency of multigene transfer. Among other insights, we find that 1) the observed relative frequency of HMGT increases as divergence between genomes increases, 2) HMGTs often have conserved gene functions, and 3) rare genes are frequently acquired through HMGT. We also analyze in detail HMGTs involving the zonula occludens toxin and type III secretion systems. By enabling the systematic inference of HMGTs on a large scale, HoMer will facilitate a more accurate and more complete understanding of HGT and microbial evolution.

5.
Feng Gao. Current Genomics 2014, 15(2): 104-112
Precise DNA replication is critical for the maintenance of genetic integrity in all organisms. In all three domains of life, DNA replication starts at a specialized locus, termed the replication origin, oriC or ORI, and its identification is vital to understanding the complex replication process. In bacteria and eukaryotes, replication initiates from single and multiple origins, respectively, while archaea can adopt either of the two modes. The Z-curve method has been successfully used to identify replication origins in genomes of various species, including multiple oriCs in some archaea. Based on the Z-curve method and comparative genomics analysis, we have developed a web-based system, Ori-Finder, for finding oriCs in bacterial genomes with high accuracy. Predicted oriC regions in bacterial genomes are organized into an online database, DoriC. Recently, archaeal oriC regions identified by both in vivo and in silico methods have also been included in the database. Here, we summarize the recent advances in in silico prediction of oriCs in bacterial and archaeal genomes using the Z-curve based method.
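
For readers unfamiliar with the Z-curve, the sketch below computes its three cumulative components using the standard definition: x_n = (A_n + G_n) - (C_n + T_n), y_n = (A_n + C_n) - (G_n + T_n), z_n = (A_n + T_n) - (G_n + C_n), where A_n, C_n, G_n, T_n are base counts in the first n positions. It illustrates only the transform; how Ori-Finder combines these curves with other evidence is not described in the abstract, and the function name `z_curve` is an assumption.

```python
def z_curve(seq: str) -> list[tuple[int, int, int]]:
    """Cumulative Z-curve coordinates (x_n, y_n, z_n) for n = 1..len(seq)."""
    # Per-base increments:
    #   x: purine (A, G) vs pyrimidine (C, T)
    #   y: amino (A, C) vs keto (G, T)
    #   z: weak (A, T) vs strong (G, C) hydrogen bonding
    step = {
        "A": (1, 1, 1),
        "C": (-1, 1, -1),
        "G": (1, -1, -1),
        "T": (-1, -1, 1),
    }
    x = y = z = 0
    points = []
    for base in seq.upper():
        dx, dy, dz = step.get(base, (0, 0, 0))  # ambiguous bases add nothing
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points
```

Extrema in curves derived from these components are commonly inspected as candidate origin or terminus positions, but any such interpretation should be checked against the original Z-curve publications.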

6.

Background  

Although there are now about 200 complete bacterial genomes in GenBank, deep bacterial phylogeny remains a difficult problem, due to confounding horizontal gene transfers and other phylogenetic "noise". Previous methods have relied primarily upon biological intuition or manual curation for choosing genomic sequences unlikely to be horizontally transferred, and have given inconsistent phylogenies with poor bootstrap confidence.

7.
Similarity Plot (S-plot) is a Windows-based application for large-scale comparisons and 2-dimensional visualization of compositional similarities between genomic sequences. This application combines 2 approaches widely used in genomics: window analysis of statistical characteristics along genomes and dot-plot visual representation. S-plot is effective in identifying highly similar regions between genomes as well as regions with unusual compositional properties (RUCPs) within a single genome, which may be indicative of horizontal gene transfer or of locus-specific selective forces. We use S-plot to identify regions that may have originated through horizontal gene transfer through a 2-step approach, by first comparing a genomic sequence to itself and, subsequently, comparing it to the genomic sequence of a closely related taxon. Moreover, by comparing these suspect sequences to one another, we can estimate a minimum number of sources for these putative xenologous sequences. We illustrate the uses of S-plot in a comparison involving Escherichia coli K12 and E. coli O157:H7. In O157:H7, we found 145 regions that have most probably originated through horizontal gene transfer. By using S-plot to compare each of these regions with 277 completely sequenced prokaryotic genomes, 1 sequence was found to have similar compositional properties to the Yersinia pseudotuberculosis genome, indicating a transfer from a Yersinia or Yersinia relative. Based upon our analysis of RUCPs in O157:H7, we infer that there were at least 53 sources of horizontally transferred sequences.
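
The window analysis mentioned above can be illustrated with a deliberately simple stand-in. The sketch slides a fixed-size window along a sequence and flags windows whose GC fraction deviates strongly from the genome-wide window mean. S-plot's actual compositional statistic is not given in the abstract, so the GC measure, the window size, the step, and the z-score cutoff here are all assumptions.

```python
from statistics import mean, pstdev


def window_gc(seq: str, size: int, step: int) -> list[tuple[int, float]]:
    """GC fraction of each window, returned as (start, gc) pairs."""
    seq = seq.upper()
    out = []
    for start in range(0, len(seq) - size + 1, step):
        w = seq[start:start + size]
        out.append((start, (w.count("G") + w.count("C")) / size))
    return out


def unusual_windows(seq: str, size: int = 100, step: int = 50,
                    z: float = 2.0) -> list[int]:
    """Start positions of windows whose GC fraction is more than z SDs from the mean."""
    gcs = window_gc(seq, size, step)
    values = [g for _, g in gcs]
    if not values:
        return []
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return []  # perfectly uniform composition: nothing stands out
    return [start for start, g in gcs if abs(g - mu) / sd > z]
```

A call such as `unusual_windows("GC" * 1000 + "AT" * 100 + "GC" * 1000)` would be expected to flag the windows overlapping the AT-rich insert; real genomes would need larger windows and a richer statistic than GC alone.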

8.
The highly specialized genomes of bacterial endosymbionts typically lack one of the major contributors of genomic flux in the free-living microbial world: bacteriophages. This study yields three results that show bacteriophages have, to the contrary, been influential in the genome evolution of the most prevalent bacterial endosymbiont of invertebrates, Wolbachia. First, we show that bacteriophage WO is more widespread in Wolbachia than previously recognized, occurring in at least 89% (35/39) of the sampled genomes. Second, we show through several phylogenetic approaches that bacteriophage WO underwent recent lateral transfers between Wolbachia bacteria that coinfect host cells in the dipteran Drosophila simulans and the hymenopteran Nasonia vitripennis. These two cases, along with a previous report in the lepidopteran Ephestia cautella, support a general mechanism for genetic exchange in endosymbionts (the "intracellular arena" hypothesis) in which genetic material moves horizontally between bacteria that coinfect the same intracellular environment. Third, we show recombination in this bacteriophage; in the region encoding a putative capsid protein, the recombination rate is faster than that of any known recombining genes in the endosymbiont genome. The combination of these three lines of genetic evidence indicates that this bacteriophage is a widespread source of genomic instability in the intracellular bacterium Wolbachia and potentially the invertebrate host. More generally, it is the first bacteriophage implicated in frequent lateral transfer between the genomes of bacterial endosymbionts. Gene transfer by bacteriophages could drive significant evolutionary change in the genomes of intracellular bacteria that are typically considered highly stable and prone to genomic degradation.

9.
In the process of analysing the four available complete archaeal genomes, we have noted that certain regions characterised as 'non-coding' exhibit significant sequence similarity to other protein sequences from Archaea and other species. Using established technology, we have identified a number of potential protein coding regions in these putative 'non-coding' regions. We have detected 524 such cases, of which 113 regions appear to code for proteins present in archaeal or other species, while the remaining 411 regions are mostly start/stop definition conflicts. Of the 113 protein coding regions, only 21 code for proteins with homologues of known function. The number of novel coding sequences identified herein amounts to 1.5% of the total genome entries, while the conflicting cases represent an additional 5%. The observed differences between the four complete archaeal genomes seem to reflect disparate approaches to genome annotation. Genome sequence collections should be regularly checked to improve gene prediction by sequence similarity, and greater effort is required to make gene definitions consistent across related species.

10.
Protein translocation across the prokaryotic plasma membrane occurs at the translocon, an evolutionarily conserved membrane-embedded proteinaceous complex. Together with the core components SecYE, prokaryotic translocons also contain auxiliary proteins, such as SecDF. Alignment of bacterial and archaeal SecDF protein sequences reveals the presence of a similar number of homologous regions within each protein. Moreover, the conserved sequence domains in the archaeal proteins are located in positions similar to those of their bacterial counterparts. However, when these domains are compared along Bacteria-Archaea lines, a much lower degree of similarity is detected. In Bacteria, SecDF are thought to modulate the membrane association of SecA, the ATPase that provides the driving force for bacterial protein secretion. As no archaeal version of SecA has been detected, the sequence differences reported here may reflect functional differences between bacterial and archaeal SecDF proteins, and by extension, between the bacterial and archaeal protein translocation processes. Moreover, the apparent absence of SecDF in several completed archaeal genomes suggests that differences may exist in the process of protein translocation within the archaeal domain itself.

11.
Although it is well known that there is no long-range colinearity in gene order in bacterial genomes, it is thought that there are several regions that are under strong structural constraints during evolution, in which gene order is extremely conserved. One such region is the str locus, containing the S10-spc-alpha operons. These operons contain genes coding for ribosomal proteins and a number of housekeeping genes. We compared the organisation of these gene clusters in 111 sequenced prokaryotic genomes (99 bacterial and 12 archaeal genomes). We also compared the organisation to the phylogeny based on 16S ribosomal RNA gene sequences and the sequences of the ribosomal proteins L22, L16 and S14. Our data indicate that there is much variation in gene order and content in these gene clusters, in both bacterial and archaeal genomes, and that differential gene loss has occurred on multiple occasions during evolution. We also noted several discrepancies between phylogenetic trees based on 16S rRNA gene sequences and sequences of ribosomal proteins L16, L22 and S14, suggesting that horizontal gene transfer did play a significant role in the evolution of the S10-spc-alpha gene clusters.

12.
The archaeal tailed viruses (arTV), evolutionarily related to tailed double-stranded DNA (dsDNA) bacteriophages of the class Caudoviricetes, represent the most common isolates infecting halophilic archaea. Only a handful of these viruses have been genomically characterized, limiting our appreciation of their ecological impacts and evolution. Here, we present 37 new genomes of haloarchaeal tailed virus isolates, more than doubling the current number of sequenced arTVs. Analysis of all 63 available complete genomes of arTVs, which we propose to classify into 14 new families and 3 orders, suggests ancient divergence of archaeal and bacterial tailed viruses and points to an extensive sharing of genes involved in DNA metabolism and counterdefense mechanisms, illuminating common strategies of virus–host interactions with tailed bacteriophages. Coupling of the comparative genomics with the host range analysis on a broad panel of haloarchaeal species uncovered 4 distinct groups of viral tail fiber adhesins controlling the host range expansion. The survey of metagenomes using viral hallmark genes suggests that the global architecture of the arTV community is shaped through recurrent transfers between different biomes, including hypersaline, marine, and anoxic environments.

Comparative genomics and host range analysis reveal the remarkable diversity and evolution of tailed archaeal viruses of the class Caudoviricetes, which together with their bacterial relatives arguably represent the most abundant and widespread virus group on our planet.

13.
Using a phylogenetic approach, we discovered three putative horizontal transfers between bacterial and archaeal species involving large clusters of genes. One transfer involves an operon of 13 genes, called mbx, which probably was transferred into the genome of Thermotoga maritima from a species belonging or close to the Pyrococcus genus. The other two involved an operon of six genes, called ech, transferred independently to the genomes of Thermoanaerobacter tengcongensis and Desulfovibrio gigas, from a species belonging or close to the Methanosarcina genus. All these transfers affected operons coding for multisubunit membrane-bound (NiFe) hydrogenases involved in the energy metabolism of the donor genomes. The functionality of the transferred operons has not been experimentally demonstrated for T. maritima, whereas in D. gigas and T. tengcongensis the encoded multisubunit hydrogenase could have a role in energy conservation. This report adds several cases to the horizontal gene transfers among hydrogenases already described.
Reviewing Editor: Dr. Siv Andersson

14.
Halachev MR, Loman NJ, Pallen MJ. PLoS ONE 2011, 6(12): e28388
Among proteins, orthologs are defined as those that are derived by vertical descent from a single progenitor in the last common ancestor of their host organisms. Our goal is to compute a complete set of protein orthologs derived from all currently available complete bacterial and archaeal genomes. Traditional approaches typically rely on all-against-all BLAST searching which is prohibitively expensive in terms of hardware requirements or computational time (requiring an estimated 18 months or more on a typical server). Here, we present xBASE-Orth, a system for ongoing ortholog annotation, which applies a "divide and conquer" approach and adopts a pragmatic scheme that trades accuracy for speed. Starting at species level, xBASE-Orth carefully constructs and uses pan-genomes as proxies for the full collections of coding sequences at each level as it progressively climbs the taxonomic tree using the previously computed data. This leads to a significant decrease in the number of alignments that need to be performed, which translates into faster computation, making ortholog computation possible on a global scale. Using xBASE-Orth, we analyzed an NCBI collection of 1,288 bacterial and 94 archaeal complete genomes with more than 4 million coding sequences in 5 weeks and predicted more than 700 million ortholog pairs, clustered in 175,531 orthologous groups. We have also identified sets of highly conserved bacterial and archaeal orthologs and in so doing have highlighted anomalies in genome annotation and in the proposed composition of the minimal bacterial genome. In summary, our approach allows for scalable and efficient computation of the bacterial and archaeal ortholog annotations. In addition, due to its hierarchical nature, it is suitable for incorporating novel complete genomes and alternative genome annotations. 
The computed ortholog data and a continuously evolving set of applications based on it are integrated in the xBASE database, available at http://www.xbase.ac.uk/.
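
To give a feel for why all-against-all comparison scales poorly, the small sketch below counts unordered genome pairs for a collection of a given size. It is an arithmetic illustration only: the abstract does not state the authors' cost model, and the helper name `genome_pairs` is an assumption.

```python
from math import comb


def genome_pairs(n_genomes: int) -> int:
    """Number of unordered genome pairs in an all-against-all comparison."""
    return comb(n_genomes, 2)


if __name__ == "__main__":
    n = 1288 + 94  # bacterial + archaeal genomes quoted in the abstract
    print(n, "genomes ->", genome_pairs(n), "pairs")
```

Because the pair count grows quadratically with the number of genomes, a hierarchical scheme that compares pan-genome proxies instead of every genome pair can plausibly cut the number of alignments substantially, which is the trade-off the abstract describes.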

15.
Toxin-antitoxin systems (TAS) are abundant, diverse, horizontally mobile gene modules that encode powerful resistance mechanisms in prokaryotes. We use the comparative-genomic approach to predict a new TAS that consists of a two-gene cassette encoding uncharacterized HicA and HicB proteins. Numerous bacterial and archaeal genomes encode from one to eight HicAB modules, which appear to be highly prone to horizontal gene transfer. The HicB protein (COG1598/COG4226) has a partially degraded RNase H fold, whereas HicA (COG1724) contains a double-stranded RNA-binding domain. The stable combination of these two domains suggests a link to RNA metabolism, possibly via an RNA interference-type mechanism. In most HicB proteins, the RNase H-like domain is fused to a DNA-binding domain, either of the ribbon-helix-helix or of the helix-turn-helix class; in other TAS, proteins containing these DNA-binding domains function as antitoxins. Thus, the HicAB module is predicted to be a novel TAS whose mechanism involves RNA binding and, possibly, cleavage.

16.
The first bacterial genome was sequenced in 1995, and the first archaeal genome in 1996. Soon after these breakthroughs, an exponential rate of genome sequencing was established, with a doubling time of approximately 20 months for bacteria and approximately 34 months for archaea. Comparative analysis of the hundreds of sequenced bacterial and dozens of archaeal genomes leads to several generalizations on the principles of genome organization and evolution. A crucial finding that enables functional characterization of the sequenced genomes and evolutionary reconstruction is that the majority of archaeal and bacterial genes have conserved orthologs in other, often distant, organisms. However, comparative genomics also shows that horizontal gene transfer (HGT) is a dominant force of prokaryotic evolution, along with the loss of genetic material resulting in genome contraction. A crucial component of the prokaryotic world is the mobilome, the enormous collection of viruses, plasmids and other selfish elements, which are in constant exchange with more stable chromosomes and serve as HGT vehicles. Thus, the prokaryotic genome space is a tightly connected, although compartmentalized, network, a novel notion that undermines the ‘Tree of Life’ model of evolution and requires a new conceptual framework and tools for the study of prokaryotic evolution.
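
The quoted doubling times can be turned into growth factors with the usual exponential relation, factor = 2^(t / T), where T is the doubling time. The sketch below is only a worked illustration of that arithmetic; the function name is an assumption, and the calculation presumes the doubling time stays constant over the interval.

```python
def growth_factor(months_elapsed: float, doubling_time_months: float) -> float:
    """Multiplicative increase after months_elapsed under steady exponential growth."""
    return 2 ** (months_elapsed / doubling_time_months)


if __name__ == "__main__":
    # Doubling times quoted in the abstract: about 20 months (bacteria), 34 months (archaea).
    for label, t_double in (("bacteria", 20), ("archaea", 34)):
        print(f"{label}: x{growth_factor(12, t_double):.2f} per year")
```

Under these assumptions the number of sequenced genomes grows roughly 1.5-fold per year for bacteria and about 1.3-fold per year for archaea.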

17.
Clustered regularly interspaced short palindromic repeats (CRISPR) constitute a bacterial and archaeal adaptive immune system that protects against bacteriophage (phage). Analysis of CRISPR loci reveals the history of phage infections and provides a direct link between phage and their hosts. All current tools for CRISPR identification have been developed to analyse completed genomes and are not well suited to the analysis of metagenomic data sets, where CRISPR loci are difficult to assemble owing to their repetitive structure and population heterogeneity. Here, we introduce a new algorithm, Crass, which is designed to identify and reconstruct CRISPR loci from raw metagenomic data without the need for assembly or prior knowledge of CRISPR in the data set. CRISPR in assembled data are often fragmented across many contigs/scaffolds and do not fully represent the population heterogeneity of CRISPR loci. Crass identified substantially more CRISPR in metagenomes previously analysed using assembly-based approaches. Using Crass, we were able to detect CRISPR that contained spacers with sequence homology to phage in the system, which would not have been identified using other approaches. The increased sensitivity, specificity and speed of Crass will facilitate comprehensive analysis of CRISPRs in metagenomic data sets, increasing our understanding of phage-host interactions and co-evolution within microbial communities.

18.
Insertion and deletion (indel)-based analyses have great potential for rooting the tree of life, but their use has been limited because they require ubiquitous sequences that have not been horizontally/laterally transferred. Very few such sequences exist. Here we describe and demonstrate a new algorithm that can use nonubiquitous sequences for rooting. This algorithm, top-down indel rooting, uses the traditional logical framework of indel rooting, but by considering gene gains and losses in addition to indel gains and losses, it is able to analyze incomplete data sets. The method is demonstrated using theoretical examples and incomplete gene sets. In particular, it is applied to the well-studied Hsp70/MreB indel, a sequence set thought to have been compromised by gene transfers from Firmicutes to archaebacteria. By sequentially assigning all observable character states, including gene absences, to the questionable archaebacterial Hsp70 and MreB sequences, we demonstrate that this gene set robustly excludes the root of the tree of life from the Gram-negative, double-membrane prokaryotes independently of the archaeal character states. There are very few ubiquitous paralog gene sets, and most of them contain compromised data. The ability of top-down rooting to use incomplete and/or compromised gene sets promises to make rooting analyses more robust and to greatly increase the number of useful indel sets.

19.
While lateral transfer is the rule in the evolutionary history of bacterial and archaeal genes, events of transfer from prokaryotes to eukaryotes are rare. Germline-transmitted animal symbionts, such as Wolbachia pipientis, are well placed to participate in such transfers. In a recent issue of Science, Dunning Hotopp et al. identified instances of transfer of Wolbachia DNA to host genomes. It is unknown whether these transfers represent innovation in animal evolution.

20.
Comparative analysis of the complete sequences of seven bacterial and three archaeal genomes leads to the first generalizations of emerging genome-based microbiology. Protein sequences are, generally, highly conserved, with ∼70% of the gene products in bacteria and archaea containing ancient conserved regions. In contrast, there is little conservation of genome organization, except for a few essential operons. The most striking conclusions derived by comparison of multiple genomes from phylogenetically distant species are that the number of universally conserved gene families is very small and that multiple events of horizontal gene transfer and genome fusion are major forces in evolution.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  ICP license: 京ICP备09084417号