Similar literature
20 similar articles found
1.
The SAR11 clade, here represented by Candidatus Pelagibacter ubique, is the most successful group of bacteria in the upper surface waters of the oceans. In contrast to previous studies that have associated the 1.3 Mb genome of Ca. Pelagibacter ubique with the less than 1.5 Mb genomes of the Rickettsiales, our phylogenetic analysis suggests that Ca. Pelagibacter ubique is most closely related to soil and aquatic Alphaproteobacteria with large genomes. This implies that the SAR11 clade and the Rickettsiales have undergone genome reduction independently. A gene flux analysis of 46 representative alphaproteobacterial genomes indicates the loss of more than 800 genes in each of Ca. Pelagibacter ubique and the Rickettsiales. Consistent with their different phylogenetic affiliations, the pattern of gene loss differs with a higher loss of genes for repair and recombination processes in Ca. Pelagibacter ubique as compared with a more extensive loss of genes for biosynthetic functions in the Rickettsiales. Some of the lost genes in Ca. Pelagibacter ubique, such as mutLS, recFN, and ruvABC, are conserved in all other alphaproteobacterial genomes including the small genomes of the Rickettsiales. The mismatch repair genes mutLS are absent from all currently sequenced SAR11 genomes and also underrepresented in the global ocean metagenome data set. We hypothesize that the unique loss of genes involved in repair and recombination processes in Ca. Pelagibacter ubique has been driven by selection and that this helps explain many of the characteristics of the SAR11 population, such as the streamlined genomes, the long branch lengths, the high recombination frequencies, and the extensive sequence divergence within the population.

2.
Molecular phylogenetics and phylogenomics are subject to noise from horizontal gene transfer (HGT) and bias from convergence in macromolecular compositions. Extensive variation in size, structure and base composition of alphaproteobacterial genomes has complicated their phylogenomics, sparking controversy over the origins and closest relatives of the SAR11 strains. SAR11 are highly abundant, cosmopolitan aquatic Alphaproteobacteria with streamlined, A+T-biased genomes. A dominant view holds that SAR11 are monophyletic and related to both Rickettsiales and the ancestor of mitochondria. Other studies dispute this, finding evidence of a polyphyletic origin of SAR11 with most strains distantly related to Rickettsiales. Although careful evolutionary modeling can reduce bias and noise in phylogenomic inference, entirely different approaches may be useful to extract robust phylogenetic signals from genomes. Here we develop simple phyloclassifiers from bioinformatically derived tRNA Class-Informative Features (CIFs), features predicted to target tRNAs for specific interactions within the tRNA interaction network. Our tRNA CIF-based model robustly and accurately classifies alphaproteobacterial genomes into one of seven undisputed monophyletic orders or families, despite great variability in tRNA gene complement sizes and base compositions. Our model robustly rejects monophyly of SAR11, classifying all but one strain as Rhizobiales with strong statistical support. Yet remarkably, conventional phylogenetic analysis of tRNAs classifies all SAR11 strains identically as Rickettsiales. We attribute this discrepancy to convergence of SAR11 and Rickettsiales tRNA base compositions. Thus, tRNA CIFs appear more robust to compositional convergence than tRNA sequences generally. Our results suggest that tRNA-CIF-based phyloclassification is robust to HGT of components of the tRNA interaction network, such as aminoacyl-tRNA synthetases. 
We explain why tRNAs are especially advantageous for prediction of traits governing macromolecular interactions from genomic data, and why such traits may be advantageous in the search for robust signals to address difficult problems in classification and phylogeny.

3.

Background

According to the endosymbiont hypothesis, the mitochondrial system for aerobic respiration was derived from an ancestral Alphaproteobacterium. Phylogenetic studies indicate that the mitochondrial ancestor is most closely related to the Rickettsiales. Recently, it was suggested that Candidatus Pelagibacter ubique, a member of the SAR11 clade that is highly abundant in the oceans, is a sister taxon to the mitochondrial-Rickettsiales clade. The availability of ocean metagenome data substantially increases the sampling of Alphaproteobacteria inhabiting the oxygen-containing waters of the oceans that likely resemble the originating environment of mitochondria.

Methodology/Principal Findings

We present a phylogenetic study of the origin of mitochondria that incorporates metagenome data from the Global Ocean Sampling (GOS) expedition. We identify mitochondrially related sequences in the GOS dataset that represent a rare group of Alphaproteobacteria, designated OMAC (Oceanic Mitochondria Affiliated Clade) as the closest free-living relatives to mitochondria in the oceans. In addition, our analyses reject the hypothesis that the mitochondrial system for aerobic respiration is affiliated with that of the SAR11 clade.

Conclusions/Significance

Our results allude to the existence of an alphaproteobacterial clade in the oxygen-rich surface waters of the oceans that represents the closest free-living relative to mitochondria identified thus far. In addition, our findings underscore the importance of expanding the taxonomic diversity in phylogenetic analyses beyond that represented by cultivated bacteria to study the origin of mitochondria.

4.
Strain HIMB100 is a planktonic marine bacterium in the class Alphaproteobacteria. This strain is of interest because it is one of the first known isolates from a globally ubiquitous clade of marine bacteria known as SAR116 within the family Rhodospirillaceae. Here we describe preliminary features of the organism, together with the draft genome sequence and annotation. This is the second genome sequence of a member of the SAR116 clade. The 2,458,945 bp genome contains 2,334 protein-coding and 42 RNA genes.

5.

Background

The SAR11 group of Alphaproteobacteria is highly abundant in the oceans. It contains a recently diverged freshwater clade, which offers the opportunity to compare adaptations to salt- and freshwaters in a monophyletic bacterial group. However, there are no cultivated members of the freshwater SAR11 group and no genomes have been sequenced yet.

Results

We isolated ten single SAR11 cells from three freshwater lakes and sequenced and assembled their genomes. A phylogeny based on 57 proteins indicates that the cells are organized into distinct microclusters. We show that the freshwater genomes have evolved primarily by the accumulation of nucleotide substitutions and that they have among the lowest ratio of recombination to mutation estimated for bacteria. In contrast, members of the marine SAR11 clade have one of the highest ratios. Additional metagenome reads from six lakes confirm low recombination frequencies for the genome overall and reveal lake-specific variations in microcluster abundances. We identify hypervariable regions with gene contents broadly similar to those in the hypervariable regions of the marine isolates, containing genes putatively coding for cell surface molecules.

Conclusions

We conclude that recombination rates differ dramatically in phylogenetic sister groups of the SAR11 clade adapted to freshwater and marine ecosystems. The results suggest that the transition from marine to freshwater systems has purged diversity and resulted in reduced opportunities for recombination with divergent members of the clade. The low recombination frequencies of the LD12 clade resemble the low genetic divergence of host-restricted pathogens that have recently shifted to a new host.

6.
Abundant proteorhodopsin genes in the North Atlantic Ocean   (total citations: 5; self-citations: 0; citations by others: 5)
Proteorhodopsin (PR) is a light-driven proton pump that has been found in a variety of marine bacteria, including Pelagibacter ubique, a member of the ubiquitous SAR11 clade. The goals of this study were to explore the diversity of PR genes and to estimate their abundance in the North Atlantic Ocean using quantitative polymerase chain reaction (QPCR). We found that PR genes in the western portion of the Sargasso Sea could be grouped into 27 clusters, but five clades had the most sequences. Sets of specific QPCR primers were designed to examine the abundance of PR genes in the following four of the five clades: SAR11 (P. ubique and other SAR11 Alphaproteobacteria), BACRED17H8 (Alphaproteobacteria), HOT2C01 (Alphaproteobacteria) and an uncultured subgroup of the Flavobacteria. Two groups (SAR11 and HOT2C01) dominated PR gene abundance in oligotrophic waters, but were significantly less abundant in nutrient- and chlorophyll-rich waters. The other two groups (BACRED17H8 and Flavobacteria subgroup NASB) were less abundant in all waters. Together, these four PR gene types were found in 50% of all bacteria in the Sargasso Sea. We found a significant negative correlation between total PR gene abundance and nutrients and chlorophyll but no significant correlation with light intensity for three of the four PR types in the depth profiles north of the Sargasso Sea. Our data suggest that PR is common in the North Atlantic Ocean, especially in SAR11 bacteria and another marine alphaproteobacterial group (HOT2C01), and that these PR-bearing bacteria are most abundant in oligotrophic waters.

7.
8.
The ubiquitous SAR11 bacterial clade is the most abundant type of organism in the world's oceans, but the reasons for its success are not fully elucidated. We analysed 128 surface marine metagenomes, including 37 new Antarctic metagenomes. The large size of the data set enabled internal transcribed spacer (ITS) regions to be obtained from the Southern polar region, enabling the first global characterization of the distribution of SAR11, from waters spanning temperatures −2 to 30°C. Our data show a stable co-occurrence of phylotypes within both ‘tropical’ (>20°C) and ‘polar’ (<10°C) biomes, highlighting ecological niche differentiation between major SAR11 subgroups. All phylotypes display transitions in abundance that are strongly correlated with temperature and latitude. By assembling SAR11 genomes from Antarctic metagenome data, we identified specific genes, biases in gene functions and signatures of positive selection in the genomes of the polar SAR11: genomic signatures of adaptive radiation. Our data demonstrate the importance of adaptive radiation in the organism's ability to proliferate throughout the world's oceans, and describe genomic traits characteristic of different phylotypes in specific marine biomes.

9.
Bacterioplankton of the SAR11 clade are the most abundant microorganisms in marine systems, usually representing 25% or more of the total bacterial cells in seawater worldwide. SAR11 is divided into subclades with distinct spatiotemporal distributions (ecotypes), some of which appear to be specific to deep water. Here we examine the genomic basis for deep ocean distribution of one SAR11 bathytype (depth-specific ecotype), subclade Ic. Four single-cell Ic genomes, with estimated completeness of 55%–86%, were isolated from 770 m at station ALOHA and compared with eight SAR11 surface genomes and metagenomic datasets. Subclade Ic genomes dominated metagenomic fragment recruitment below the euphotic zone. They had similar COG distributions, high local synteny and shared a large number (69%) of orthologous clusters with SAR11 surface genomes, yet were distinct at the 16S rRNA gene and amino-acid level, and formed a separate, monophyletic group in phylogenetic trees. Subclade Ic genomes were enriched in genes associated with membrane/cell wall/envelope biosynthesis and showed evidence of unique phage defenses. The majority of subclade Ic-specific genes were hypothetical, and some were highly abundant in deep ocean metagenomic data, potentially masking mechanisms for niche differentiation. However, the evidence suggests these organisms have a similar metabolism to their surface counterparts, and that subclade Ic adaptations to the deep ocean do not involve large variations in gene content, but rather more subtle differences previously observed in deep ocean genomic data, like preferential amino-acid substitutions, larger coding regions among SAR11 clade orthologs, larger intergenic regions and larger estimated average genome size.

10.
The SAR11 Alphaproteobacteria are the most abundant heterotrophs in the oceans and are believed to play a major role in mineralizing marine dissolved organic carbon. Their genomes are among the smallest known for free-living heterotrophic cells, raising questions about how they successfully utilize complex organic matter with a limited metabolic repertoire. Here we show that conserved genes in SAR11 subgroup Ia (Candidatus Pelagibacter ubique) genomes encode pathways for the oxidation of a variety of one-carbon compounds and methyl functional groups from methylated compounds. These pathways were predicted to produce energy by tetrahydrofolate (THF)-mediated oxidation, but not to support the net assimilation of biomass from C1 compounds. Measurements of cellular ATP content and the oxidation of ¹⁴C-labeled compounds to ¹⁴CO₂ indicated that methanol, formaldehyde, methylamine, and methyl groups from glycine betaine (GBT), trimethylamine (TMA), trimethylamine N-oxide (TMAO), and dimethylsulfoniopropionate (DMSP) were oxidized by axenic cultures of the SAR11 strain Ca. P. ubique HTCC1062. Analyses of metagenomic data showed that genes for C1 metabolism occur at a high frequency in natural SAR11 populations. In short-term incubations, natural communities of Sargasso Sea microbial plankton expressed a potential for the oxidation of ¹⁴C-labeled formate, formaldehyde, methanol and TMAO that was similar to cultured SAR11 cells and, like cultured SAR11 cells, incorporated a much larger percentage of pyruvate and glucose (27-35%) than of C1 compounds (2-6%) into biomass. Collectively, these genomic, cellular and environmental data show a surprising capacity for demethylation and C1 oxidation in SAR11 cultures and in natural microbial communities dominated by SAR11, and support the conclusion that C1 oxidation might be a significant conduit by which dissolved organic carbon is recycled to CO₂ in the upper ocean.

11.
Strain HIMB11 is a planktonic marine bacterium isolated from coastal seawater in Kaneohe Bay, Oahu, Hawaii, belonging to the ubiquitous and versatile Roseobacter clade of the alphaproteobacterial family Rhodobacteraceae. Here we describe the preliminary characteristics of strain HIMB11, including annotation of the draft genome sequence and comparative genomic analysis with other members of the Roseobacter lineage. The 3,098,747 bp draft genome is arranged in 34 contigs and contains 3,183 protein-coding genes and 54 RNA genes. Phylogenomic and 16S rRNA gene analyses indicate that HIMB11 represents a unique sublineage within the Roseobacter clade. Comparison with other publicly available genome sequences from members of the Roseobacter lineage reveals that strain HIMB11 has the genomic potential to utilize a wide variety of energy sources (e.g. organic matter, reduced inorganic sulfur, light, carbon monoxide), while possessing a reduced number of substrate transporters.

12.
MOTIVATION: The determination of gene orthology is a prerequisite for mining and utilizing the rapidly increasing amount of sequence data for genome-scale phylogenetics and comparative genomic studies. To date, most researchers have used pairwise distance comparison algorithms, such as BLAST, COG, RBH, RSD and INPARANOID, to determine gene orthology. In contrast, orthology determination within a character-based phylogenetic framework has not been utilized on a genomic scale owing to the lack of efficiency and automation. RESULTS: We have developed OrthologID, a Web application that automates the labor-intensive procedures of gene orthology determination within a character-based phylogenetic framework, thus making character-based orthology determination on a genomic scale possible. In addition to generating gene family trees and determining orthologous gene sets for complete genomes, OrthologID can also identify diagnostic characters that define each orthologous gene set, as well as diagnostic characters that are responsible for classifying query sequences from other genomes into specific orthology groups. The OrthologID database currently includes several complete plant genomes, including Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, as well as a unicellular outgroup, Chlamydomonas reinhardtii. To improve the general utility of OrthologID beyond plant species, we plan to expand our sequence database to include the fully sequenced genomes of prokaryotes and other non-plant eukaryotes. AVAILABILITY: http://nypg.bio.nyu.edu/orthologid/

13.
Clustering of main orthologs for multiple genomes   (total citations: 1; self-citations: 0; citations by others: 1)
The identification of orthologous genes shared by multiple genomes is critical for both functional and evolutionary studies in comparative genomics. While it is usually done by sequence similarity search and reconciled tree construction in practice, recently a new combinatorial approach and high-throughput system MSOAR for ortholog identification between closely related genomes based on genome rearrangement and gene duplication has been proposed in Fu et al. MSOAR assumes that orthologous genes correspond to each other in the most parsimonious evolutionary scenario, minimizing the number of genome rearrangement and (postspeciation) gene duplication events. However, the parsimony approach used by MSOAR limits it to pairwise genome comparisons. In this paper, we extend MSOAR to multiple (closely related) genomes and propose an ortholog clustering method, called MultiMSOAR, to infer main orthologs in multiple genomes. As a preliminary experiment, we apply MultiMSOAR to rat, mouse, and human genomes, and validate our results using gene annotations and gene function classifications in the public databases. We further compare our results to the ortholog clusters predicted by MultiParanoid, which is an extension of the well-known program InParanoid for pairwise genome comparisons. The comparison reveals that MultiMSOAR gives more detailed and accurate orthology information, since it can effectively distinguish main orthologs from inparalogs.

14.
Single-stranded DNA (ssDNA) viruses are economically important pathogens of plants and animals, and are widespread in oceans; yet, the diversity and evolutionary relationships among marine ssDNA viruses remain largely unknown. Here we present the results from a metagenomic study of composite samples from temperate (Saanich Inlet, 11 samples; Strait of Georgia, 85 samples) and subtropical (46 samples, Gulf of Mexico) seawater. Most sequences (84%) had no evident similarity to sequenced viruses. In total, 608 putative complete genomes of ssDNA viruses were assembled, almost doubling the number of ssDNA viral genomes in databases. These comprised 129 genetically distinct groups, each represented by at least one complete genome that had no recognizable similarity to each other or to other virus sequences. Given that the seven recognized families of ssDNA viruses have considerable sequence homology within them, this suggests that many of these genetic groups may represent new viral families. Moreover, nearly 70% of the sequences were similar to one of these genomes, indicating that most of the sequences could be assigned to a genetically distinct group. Most sequences fell within 11 well-defined gene groups, each sharing a common gene. Some of these encoded putative replication and coat proteins that had similarity to sequences from viruses infecting eukaryotes, suggesting that these were likely from viruses infecting eukaryotic phytoplankton and zooplankton.

15.
Analyses of 55 individual and 31 concatenated protein data sets encoded in Reclinomonas americana and Marchantia polymorpha mitochondrial genomes revealed that current methods for constructing phylogenetic trees are insufficiently sensitive (or artifact-insensitive) to ascertain the sister of mitochondria among the current sample of eight alpha-proteobacterial genomes using mitochondrially-encoded proteins. However, Rhodospirillum rubrum came as close to mitochondria as any alpha-proteobacterium investigated. This prompted a search for methods to directly compare eukaryotic genomes to their prokaryotic counterparts to investigate the origin of the mitochondrion and its host from the standpoint of nuclear genes. We examined pairwise amino acid sequence identity in comparisons of 6,214 nuclear protein-coding genes from Saccharomyces cerevisiae to 177,117 proteins encoded in sequenced genomes from 45 eubacteria and 15 archaebacteria. The results reveal that approximately 75% of yeast genes having homologues among the present prokaryotic sample share greater amino acid sequence identity to eubacterial than to archaebacterial homologues. At high stringency comparisons, only the eubacterial component of the yeast genome is detectable. Our findings indicate that at the levels of overall amino acid sequence identity and gene content, yeast shares a sister-group relationship with eubacteria, not with archaebacteria, in contrast to the current phylogenetic paradigm based on ribosomal RNA. Among eubacteria and archaebacteria, proteobacterial and methanogen genomes, respectively, shared more similarity with the yeast genome than other prokaryotic genomes surveyed.

16.
Gutless oligochaetes are small marine worms that live in obligate associations with bacterial endosymbionts. While symbionts from several host species belonging to the genus Olavius have been described, little is known of the symbionts from the host genus Inanidrilus. In this study, the diversity of bacterial endosymbionts in Inanidrilus leukodermatus from Bermuda and Inanidrilus makropetalos from the Bahamas was investigated using comparative sequence analysis of the 16S rRNA gene and fluorescence in situ hybridization. As in all other gutless oligochaetes examined to date, I. leukodermatus and I. makropetalos harbor large, oval bacteria identified as Gamma 1 symbionts. The presence of genes coding for ribulose-1,5-bisphosphate carboxylase/oxygenase form I (cbbL) and adenosine 5'-phosphosulfate reductase (aprA) supports earlier studies indicating that these symbionts are chemoautotrophic sulfur oxidizers. Alphaproteobacteria, previously identified only in the gutless oligochaete Olavius loisae from the southwest Pacific Ocean, coexist with the Gamma 1 symbionts in both I. leukodermatus and I. makropetalos, with the former harboring four and the latter two alphaproteobacterial phylotypes. The presence of these symbionts in hosts from such geographically distant oceans as the Atlantic and Pacific suggests that symbioses with alphaproteobacterial symbionts may be widespread in gutless oligochaetes. The high phylogenetic diversity of bacterial endosymbionts in two species of the genus Inanidrilus, previously known only from members of the genus Olavius, shows that the stable coexistence of multiple symbionts is a common feature in gutless oligochaetes.

17.
Genomic trees have been constructed based on the presence and absence of families of protein-encoding genes observed in 27 complete genomes, including genomes of 15 free-living organisms. This method does not rely on the identification of suspected orthologs in each genome, nor the specific alignment used to compare gene sequences because the protein-encoding gene families are formed by grouping any protein with a pairwise similarity score greater than a preset value. Because of this all-inclusive grouping, this method is resilient to some effects of lateral gene transfer because transfers of genes are masked when the recipient genome already has a homolog (not necessarily an ortholog) of the incoming gene. Of 71 genes suspected to have been laterally transferred to the genome of Aeropyrum pernix, only approximately 7 to 15 represent genes where a lateral gene transfer appears to have generated homoplasy in our character dataset. The genomic tree of the 15 free-living taxa includes six different bacterial orders, six different archaeal orders, and two different eukaryotic kingdoms. The results are remarkably similar to results obtained by analysis of rRNA. Inclusion of the other 12 genomes resulted in a tree only broadly similar to that suggested by rRNA with at least some of the differences due to artifacts caused by the small genome size of many of these species. Very small genomes, such as those of the two Mycoplasma genomes included, fall to the base of the Bacterial domain, a result expected due to the substantial gene loss inherent to these lineages. Finally, artificial "partial genomes" were generated by randomly selecting ORFs from the complete genomes in order to test our ability to recover the tree generated by the whole genome sequences when only partial data are available. The results indicated that partial genomic data, when sampled randomly, could robustly recover the tree generated by the whole genome sequences.
Received: 30 May 2001 / Accepted: 10 October 2001
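
The grouping step described in this abstract (any two proteins whose pairwise similarity score exceeds a preset value fall into the same family, and each genome is then scored for presence or absence of each family) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy protein identifiers, the threshold of 0.5 and the use of a Jaccard distance between genome profiles are all assumptions made for illustration, and the abstract does not say how the trees themselves were computed.

```python
def gene_families(similarities, threshold):
    """Single-linkage grouping: proteins joined by any score > threshold share a family.

    `similarities` maps (protein_a, protein_b) pairs to a similarity score
    (hypothetical input format, chosen for this sketch).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarities.items():
        root_a, root_b = find(a), find(b)  # registers both proteins
        if score > threshold and root_a != root_b:
            parent[root_a] = root_b

    families = {}
    for protein in list(parent):
        families.setdefault(find(protein), set()).add(protein)
    return list(families.values())


def presence_absence(families, genome_of):
    """Map each genome to the set of family indices present in it."""
    profile = {}
    for index, family in enumerate(families):
        for protein in family:
            profile.setdefault(genome_of[protein], set()).add(index)
    return profile


def jaccard_distance(a, b):
    """1 - |intersection| / |union|; an assumed distance, not one named in the abstract."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


# Toy data: five proteins from three genomes (all identifiers are made up).
genome_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B", "C1": "C"}
similarities = {
    ("A1", "B1"): 0.9,
    ("B1", "C1"): 0.8,
    ("A2", "B2"): 0.7,
    ("A1", "A2"): 0.1,
    ("A2", "C1"): 0.2,
}

families = gene_families(similarities, threshold=0.5)
profile = presence_absence(families, genome_of)
for first, second in (("A", "B"), ("A", "C")):
    print(first, second, jaccard_distance(profile[first], profile[second]))
```

Because the grouping is single-linkage, a protein joins a family as soon as it resembles any one member. That matches the abstract's point that a transferred gene is masked when the recipient genome already carries a homolog in the same family.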

18.
DNA-DNA hybridization has been established as an important technology in bacterial species taxonomy and phylogenetic analysis. In this study, we analyzed how the efficiency with which the genomic DNA from one species hybridizes to the genomic DNA of another species (DNA-DNA hybridization) in microarray analysis relates to the similarity between two genomes. We found that the predicted DNA-DNA hybridization based on genome sequence similarity correlated well with the experimentally determined microarray hybridization. Between closely related strains, significant numbers of highly divergent genes (<55% identity) and/or the accumulation of mismatches between conserved genes lowered the DNA-DNA hybridization signal, and this reduced the hybridization signals to below 70% even for bacterial strains with over 97% 16S rRNA gene identity. In addition, our results also suggest that a DNA-DNA hybridization signal intensity of over 40% indicates that two genomes share at least 30% conserved genes (>60% gene identity). This study may expand our knowledge of DNA-DNA hybridization based on genomic sequence similarity comparison and further provide insights for bacterial phylogeny analyses.

19.
Advances in next-generation sequencing technologies are providing longer nucleotide sequence reads that contain more information about phylogenetic relationships. We sought to use this information to understand the evolution and ecology of bacterioplankton at our long-term study site in the Western Sargasso Sea. A bioinformatics pipeline called PhyloAssigner was developed to align pyrosequencing reads to a reference multiple sequence alignment of 16S ribosomal RNA (rRNA) genes and assign them phylogenetic positions in a reference tree using a maximum likelihood algorithm. Here, we used this pipeline to investigate the ecologically important SAR11 clade of Alphaproteobacteria. A combined set of 2.7 million pyrosequencing reads from the 16S rRNA V1–V2 regions, representing 9 years at the Bermuda Atlantic Time-series Study (BATS) site, was quality checked and parsed into a comprehensive bacterial tree, yielding 929,036 Alphaproteobacteria reads. Phylogenetic structure within the SAR11 clade was linked to seasonally recurring spatiotemporal patterns. This analysis resolved four new SAR11 ecotypes in addition to five others that had been described previously at BATS. The data support a conclusion reached previously that the SAR11 clade diversified by subdivision of niche space in the ocean water column, but the new data reveal a more complex pattern in which deep branches of the clade diversified repeatedly across depth strata and seasonal regimes. The new data also revealed the presence of an unrecognized clade of Alphaproteobacteria, here named SMA-1 (Sargasso Mesopelagic Alphaproteobacteria, group 1), in the upper mesopelagic zone. The high-resolution phylogenetic analyses performed herein highlight significant, previously unknown patterns of evolutionary diversification within perhaps the most widely distributed heterotrophic marine bacterial clade, and strong links to ecosystem regimes.

20.