Similar articles
1.
2.
3.
Genomic variation in the model plant Arabidopsis thaliana has been extensively used to understand evolutionary processes in natural populations, mainly focusing on single-nucleotide polymorphisms. Conversely, structural variation has been largely ignored in spite of its potential to dramatically affect phenotype. Here, we identify 155,440 indels and structural variants ranging in size from 1 bp to 10 kb, including presence/absence variants (PAVs), inversions, and tandem duplications in 1,301 A. thaliana natural accessions from Morocco, Madeira, Europe, Asia, and North America. We show evidence for strong purifying selection on PAVs in genes, in particular for housekeeping genes and homeobox genes, and we find that PAVs are concentrated in defense-related genes (R-genes, secondary metabolites) and F-box genes. This implies the presence of a “core” genome underlying basic cellular processes and a “flexible” genome that includes genes that may be important in spatially or temporally varying selection. Further, we find an excess of intermediate frequency PAVs in defense response genes in nearly all populations studied, consistent with a history of balancing selection on this class of genes. Finally, we find that PAVs in genes involved in the cold requirement for flowering (vernalization) and drought response are strongly associated with temperature at the sites of origin.

4.
Background and Aims: Although hybridization can play a positive role in plant evolution, it has been shown that excessive unidirectional hybridization can result in replacement of a species’ gene pool, and even the extinction of rare species via genetic assimilation. This study examines levels of introgression between the common Saxifraga spathularis and its rarer congener S. hirsuta, which have been observed to hybridize in the wild where they occur sympatrically.

Methods: Seven species-specific single nucleotide polymorphisms (SNPs) were analysed in 1025 plants representing both species and their hybrid, S. × polita, from 29 sites across their ranges in Ireland. In addition, species distribution modelling was carried out to determine whether the relative abundance of the two parental species is likely to change under future climate scenarios.

Key Results: Saxifraga spathularis individuals tended to be genetically pure, exhibiting little or no introgression from S. hirsuta, but significant levels of introgression of S. spathularis alleles into S. hirsuta were observed, indicating that populations exhibiting S. hirsuta morphology are more like a hybrid swarm, consisting of backcrosses and F2s. Populations of the hybrid, S. × polita, were generally comprised of F1s or F2s, with some evidence of backcrossing. Species distribution modelling under projected future climate scenarios indicated an increase in suitable habitats for both parental species.

Conclusions: Levels of introgression observed in this study in both S. spathularis and S. hirsuta would appear to be correlated with the relative abundance of the species. Significant introgression of S. spathularis alleles was detected in the majority of the S. hirsuta populations analysed and, consequently, ongoing introgression would appear to represent a threat to the genetic integrity of S. hirsuta, particularly in areas where the species exists sympatrically with its congener and where it is greatly outnumbered.

5.
6.
The Tibetan grey wolf (Canis lupus chanco) occupies habitats on the Qinghai-Tibet Plateau, a high altitude (>3000 m) environment where low oxygen tension exerts unique selection pressure on individuals to adapt to hypoxic conditions. To identify genes involved in hypoxia adaptation, we generated complete genome sequences of nine Chinese wolves from high and low altitude populations at an average coverage of 25×. We found that, beginning about 55,000 years ago, the highland Tibetan grey wolf suffered a more substantial population decline than lowland wolves. Positively selected hypoxia-related genes in highland wolves are enriched in the HIF signaling pathway (P = 1.57E-6), ATP binding (P = 5.62E-5), and response to an oxygen-containing compound (P ≤ 5.30E-4). Of these positively selected hypoxia-related genes, three genes (EPAS1, ANGPT1, and RYR2) had at least one specific fixed non-synonymous SNP in highland wolves based on the nine genome data. Our re-sequencing studies on a large panel of individuals showed a frequency difference greater than 58% between highland and lowland wolves for these specific fixed non-synonymous SNPs and a high degree of LD surrounding the three genes, which together imply strong selection. Past studies have shown that EPAS1 and ANGPT1 are important in the response to hypoxic stress, and RYR2 is involved in heart function. These three genes also exhibited significant signals of natural selection in high altitude human populations, suggesting similar evolutionary constraints on natural selection in wolves and humans of the Qinghai-Tibet Plateau.

7.
A nonsense allele at rs1343879 in human MAGEE2 on chromosome X has previously been reported as a strong candidate for positive selection in East Asia. This premature stop codon causing ∼80% protein truncation is characterized by a striking geographical pattern of high population differentiation: common in Asia and the Americas (up to 84% in the 1000 Genomes Project East Asians) but rare elsewhere. Here, we generated a Magee2 mouse knockout mimicking the human loss-of-function mutation to study its functional consequences. The Magee2 null mice did not exhibit gross abnormalities apart from enlarged brain structures (13% increased total brain area, P = 0.0022) in hemizygous males. The area of the granular retrosplenial cortex responsible for memory, navigation, and spatial information processing was the most severely affected, exhibiting an enlargement of 34% (P = 3.4 × 10^-6). Homozygous females showed the opposite trend of reduced brain size, although this did not reach statistical significance. With these insights, we performed human association analyses between brain size measurements and rs1343879 genotypes in 141 Chinese volunteers with brain MRI scans, replicating the sexual dimorphism seen in the knockout mouse model. The derived stop gain allele was significantly associated with a larger volume of gray matter in males (P = 0.00094), and smaller volumes of gray (P = 0.00021) and white (P = 0.0015) matter in females. It is unclear whether or not the observed neuroanatomical phenotypes affect behavior or cognition, but they might have been the driving force underlying the positive selection in humans.

8.
9.
Reproductive isolation between lineages is expected to accumulate with divergence time, but the time taken to speciate may strongly vary between different groups of organisms. In anuran amphibians, laboratory crosses can still produce viable hybrid offspring >20 My after separation, but the speed of speciation in closely related anuran lineages under natural conditions is poorly studied. Palearctic green toads (Bufo viridis subgroup) offer an excellent system to address this question, comprising several lineages that arose at different times and form secondary contact zones. Using mitochondrial and nuclear markers, we previously demonstrated that in Sicily, B. siculus and B. balearicus developed advanced reproductive isolation after Plio-Pleistocene divergence (2.6 My, 3.3–1.9), with limited historic mtDNA introgression, scarce nuclear admixture, but low, if any, current gene flow. Here, we study genetic interactions between younger lineages of early Pleistocene divergence (1.9 My, 2.5–1.3) in northeastern Italy (B. balearicus, B. viridis). We find significantly more, asymmetric nuclear and wider, differential mtDNA introgression. The population structure seems to be molded by geographic distance and barriers (rivers), much more than by intrinsic genomic incompatibilities. These differences of hybridization between zones may be partly explained by differences in the duration of previous isolation. Scattered research on other anurans suggests that wide hybrid zones with strong introgression may develop when secondary contacts occur <2 My after divergence, whereas narrower zones with restricted gene flow form when divergence exceeds 3 My. Our study strengthens support for this rule of thumb by comparing lineages with different divergence times within the same radiation.

10.
Polymorphisms in genes that control immune function and regulation may influence susceptibility to pulmonary tuberculosis (TB). In this study, 14 polymorphisms in 12 key genes involved in the immune response (VDR, MR1, TLR1, TLR2, TLR10, SLC11A1, IL1B, IL10, IFNG, TNF, IRAK1, and FOXP3) were tested for their association with pulmonary TB in 271 patients with TB and 251 community-matched controls from the Republic of Moldova. In addition, gene–gene interactions involved in TB susceptibility were analyzed for a total of 43 genetic loci. Single nucleotide polymorphism (SNP) analysis revealed a nominal association between TNF rs1800629 and pulmonary TB (Fisher exact test P = 0.01843). In the pairwise interaction analysis, the combination of the genotypes TLR6 rs5743810 GA and TLR10 rs11096957 GT was significantly associated with an increased genetic risk of pulmonary TB (OR = 2.48, 95% CI = 1.62–3.85; Fisher exact test P value = 1.5 × 10^-5, significant after Bonferroni correction). In conclusion, the TLR6 rs5743810 and TLR10 rs11096957 two-locus interaction confers a significantly higher risk for pulmonary TB; due to its high frequency in the population, this SNP combination may serve as a novel biomarker for predicting TB susceptibility.

11.
Allele frequency differences across populations can provide valuable information both for studying population structure and for identifying loci that have been targets of natural selection. Here, we examine the relationship between recombination rate and population differentiation in humans by analyzing two uniformly-ascertained, whole-genome data sets. We find that population differentiation as assessed by inter-continental FST shows negative correlation with recombination rate, with FST reduced by 10% in the tenth of the genome with the highest recombination rate compared with the tenth of the genome with the lowest recombination rate (P ≪ 10^-12). This pattern cannot be explained by the mutagenic properties of recombination and instead must reflect the impact of selection in the last 100,000 years since human continental populations split. The correlation between recombination rate and FST has a qualitatively different relationship for FST between African and non-African populations and for FST between European and East Asian populations, suggesting varying levels or types of selection in different epochs of human history.

12.
In cultivated tetraploid potato (Solanum tuberosum), reduction to diploidy (dihaploidy) allows for hybridization to diploids and introgression breeding and may facilitate the production of inbreds. Pollination with haploid inducers (HIs) yields maternal dihaploids, as well as triploid and tetraploid hybrids. Dihaploids may result from parthenogenesis, entailing the development of embryos from unfertilized eggs, or genome elimination, entailing missegregation and the loss of paternal chromosomes. A sign of genome elimination is the occasional persistence of HI DNA in some dihaploids. We characterized the genomes of 919 putative dihaploids and 134 hybrids produced by pollinating tetraploid clones with three HIs: IVP35, IVP101, and PL-4. Whole-chromosome or segmental aneuploidy was observed in 76 dihaploids, with karyotypes ranging from 2n = 2x − 1 = 23 to 2n = 2x + 3 = 27. Of the additional chromosomes in 74 aneuploids, 66 were from the non-inducer parent and 8 from the inducer parent. Overall, we detected full or partial chromosomes from the HI parent in 0.87% of the dihaploids, irrespective of parental genotypes. Chromosomal breaks commonly affected the paternal genome in the dihaploid and tetraploid progeny, but not in the triploid progeny, correlating instability to sperm ploidy and to haploid induction. The residual HI DNA discovered in the progeny is consistent with genome elimination as the mechanism of haploid induction.

A large potato progeny population produced by crossing tetraploid cultivated clones to diploid Phureja lines displays rare instances of haploid inducer chromosomes, which are frequently damaged.

13.
Transposable element (TE) amplification has been recognized as a driving force mediating genome size expansion and evolution, but the consequences for shaping 3D genomic architecture remain largely unknown in plants. Here, we report reference-grade genome assemblies for three species of cotton ranging 3-fold in genome size, namely Gossypium rotundifolium (K2), G. arboreum (A2), and G. raimondii (D5), using Oxford Nanopore Technologies. Comparative genome analyses document the details of lineage-specific TE amplification contributing to the large genome size differences (K2, 2.44 Gb; A2, 1.62 Gb; D5, 750.19 Mb) and indicate relatively conserved gene content and synteny relationships among genomes. We found that approximately 17% of syntenic genes exhibit chromatin status change between active (“A”) and inactive (“B”) compartments, and TE amplification was associated with the increase of the proportion of A compartment in gene regions (∼7,000 genes) in K2 and A2 relative to D5. Only 42% of topologically associating domain (TAD) boundaries were conserved among the three genomes. Our data implicate recent amplification of TEs following the formation of lineage-specific TAD boundaries. This study sheds light on the role of transposon-mediated genome expansion in the evolution of higher-order chromatin structure in plants.

14.
Spinach (Spinacia oleracea) is grown as a nutritious leafy vegetable worldwide. To accelerate spinach breeding efficiency, a high-quality reference genome sequence with great completeness and continuity is needed as a basic infrastructure. Here, we used long-read and linked-read technologies to construct a de novo spinach genome assembly, designated SOL_r1.1, which was comprised of 287 scaffolds (total size: 935.7 Mb; N50 = 11.3 Mb) with a low proportion of undetermined nucleotides (Ns = 0.34%) and with high gene completeness (BUSCO complete 96.9%). A genome-wide survey of resistance gene analogues identified 695 genes encoding nucleotide-binding site domains, receptor-like protein kinases, receptor-like proteins and transmembrane-coiled coil domains. Based on a high-density double-digest restriction-site associated DNA sequencing-based linkage map, the genome assembly was anchored to six pseudomolecules representing ∼73.5% of the whole genome assembly. In addition, we used SOL_r1.1 to identify quantitative trait loci for bolting timing and fruit/seed shape, which harbour biologically plausible candidate genes, such as homologues of the FLOWERING LOCUS T and EPIDERMAL PATTERNING FACTOR-LIKE genes. The new genome assembly, SOL_r1.1, will serve as a useful resource for identifying loci associated with important agronomic traits and for developing molecular markers for spinach breeding/selection programs.

15.
Evolvulus alsinoides, belonging to the family Convolvulaceae, is an important medicinal plant widely used as a nootropic in the Indian traditional medicine system. In the genus Evolvulus, no research on the chloroplast genome has been published. Hence, the present study focuses on annotation, characterization, identification of mutational hotspots, and phylogenetic analysis in the complete chloroplast genome (cp) of E. alsinoides. Genome comparison and evolutionary dynamics were performed with the species of Solanales. The cp genome has 114 genes (80 protein-coding genes, 30 transfer RNA, and 4 ribosomal RNA genes) that were unique with total genome size of 157,015 bp. The cp genome possesses 69 RNA editing sites and 44 simple sequence repeats (SSRs). Predicted SSRs were randomly selected and validated experimentally. Six divergent hotspots such as trnQ-UUG, trnF-GAA, psaI, clpP, ndhF, and ycf1 were discovered from the cp genome. These microsatellites and divergent hot spot sequences of the Taxa ‘Evolvulus’ could be employed as molecular markers for species identification and genetic divergence investigations. The LSC area was found to be more conserved than the SSC and IR region in genome comparison. The IR contraction and expansion studies show that nine genes rpl2, rpl23, ycf1, ycf2, ycf1, ndhF, ndhA, matK, and psbK were present in the IR-LSC and IR-SSC boundaries of the cp genome. Fifty-four protein-coding genes in the cp genome were under negative selection pressure, indicating that they were well conserved and were undergoing purifying selection. The phylogenetic analysis reveals that E. alsinoides is closely related to the genus Cressa with some divergence from the genus Ipomoea. This is the first time the chloroplast genome of the genus Evolvulus has been published. 
The findings of the present study and chloroplast genome data could be a valuable resource for future studies in population genetics, genetic diversity, and evolutionary relationship of the family Convolvulaceae.

Supplementary Information: The online version contains supplementary material available at 10.1007/s12298-021-01051-w.

16.
Campylobacter jejuni CI 120 is a natural isolate obtained during poultry processing and has the ability to induce an acid tolerance response (ATR) to acid + aerobic conditions in early stationary phase. The other strains tested either did not induce an ATR or induced it in exponential phase. Campylobacter spp. do not contain the genes that encode the global stationary phase stress response mechanism. Therefore, the aim of this study was to identify genes that are involved in the C. jejuni CI 120 early stationary phase ATR, as it seems to be expressing a novel mechanism of stress tolerance. Two-dimensional gel electrophoresis was used to examine the expression profile of cytosolic proteins during the C. jejuni CI 120 adaptation to acid + aerobic stress, and microarrays were used to determine the genes that participate in the ATR. The results indicate induction of a global response that activated a number of stress responses, including several genes encoding surface components and genes involved with iron uptake. The findings of this study provide new insights into stress tolerance of C. jejuni, contribute to a better knowledge of the physiology of this bacterium and highlight the diversity among different strains.

17.
Deep-sea hydrothermal vents resemble the early Earth, and thus the dominant Thermococcaceae inhabitants, which occupy an evolutionarily basal position of the archaeal tree and take an obligate anaerobic hyperthermophilic free-living lifestyle, are likely excellent models to study the evolution of early life. Here, we determined that the unbiased mutation rate of a representative species, Thermococcus eurythermalis, exceeded that of all known free-living prokaryotes by 1–2 orders of magnitude, and thus rejected the long-standing hypothesis that low mutation rates were selectively favored in hyperthermophiles. We further sequenced multiple and diverse isolates of this species and calculated that T. eurythermalis has a lower effective population size than other free-living prokaryotes by 1–2 orders of magnitude. These data collectively indicate that the high mutation rate of this species is not selectively favored but instead driven by random genetic drift. The availability of these unusual data also helps explore mechanisms underlying microbial genome size evolution. We showed that genome size is negatively correlated with mutation rate and positively correlated with effective population size across 30 bacterial and archaeal lineages, suggesting that increased mutation rate and random genetic drift are likely two important mechanisms driving microbial genome reduction. Future determinations of the unbiased mutation rate of more representative lineages with highly reduced genomes such as Prochlorococcus and Pelagibacterales that dominate marine microbial communities are essential to test these hypotheses.

Subject terms: Archaea, Population genetics

One theory for the origin of life is that the last universal common ancestor was an anaerobic hyperthermophilic organism inhabiting the deep-sea hydrothermal vents, as these environments display a few characteristics paralleling the early Earth [1]. While hydrothermal vents vary in their chemical parameters, they all share a high temperature zone near the black chimney, from which anaerobic fluid emerges. In the past decades, great efforts were made to understand the metabolic strategies deep-sea hyperthermophiles use to conserve energy and cope with physicochemical stresses, and to appreciate the molecular mechanisms leading to the stabilization of nucleic acids and proteins at exceedingly high temperatures [2, 3]. However, little is known about whether they have a low or high intrinsic (i.e., not selected by environmental pressure) rate to change their genetic background information and whether this intrinsic potential itself is a result of selection shaped by these unique habitats.

A previous population genomic analysis showed that protein sequences are under greater functional constraints in thermophiles than in mesophiles, suggesting that mutations are functionally more deleterious in thermophiles than in mesophiles [4]. This explanation is also supported by experimental assays showing that nearly neutral mutations in temperate conditions become strongly deleterious at high temperature [5]. Furthermore, fluctuation tests on a hyperthermophilic archaeon Sulfolobus acidocaldarius [6] and a hyperthermophilic bacterium Thermus thermophilus [7] consistently showed that hyperthermophiles have a much lower mutation rate than mesophiles. 
This appears to support the hypothesis that selection favors high replication fidelity at high temperature [5].

Nevertheless, mutation rates measured using fluctuation experiments based on reporter loci are known to be biased, since the mutation rate of the organism is extrapolated from a few specific nonsynonymous mutations enabling survival in an appropriate selective medium, which renders the results susceptible to uncertainties associated with the representativeness of these loci and to inaccuracies of the assumptions made in extrapolation methods [8–10]. These limitations are avoided by the mutation accumulation (MA) experiment followed by whole-genome sequencing (WGS) of derived lines. In the MA part, multiple independent MA lines initiated from a single progenitor cell each regularly pass through a single-cell bottleneck, usually by transferring on solid medium. As the effective population size (Ne) becomes one, selection is unable to eliminate all but the lethal mutations, rendering the MA/WGS an approximately unbiased method to measure the spontaneous mutation rate [11].

Members of the free-living anaerobic hyperthermophilic archaeal family Thermococcaceae are among the dominant microbial lineages in the black-smoker chimney at Guaymas Basin [12] and other deep-sea hydrothermal vents [13, 14]. This family only contains three genera: Thermococcus, Pyrococcus and Palaeococcus. In this study, the MA/WGS procedure was applied to determine the unbiased spontaneous mutation rate of a representative member, Thermococcus eurythermalis A501, a conditional piezophilic archaeon which can grow equally well from 0.1 MPa to 30 MPa at 85 °C [15, 16]. The MA lines were propagated at this optimal temperature on plates with gelrite, which tolerates high temperature, and the experiment was performed under normal air pressure and in strictly anaerobic condition (Fig. 1A–D). 
To the best of our knowledge, this is the first report of the unbiased mutation rate of a hyperthermophile and an obligate anaerobe.

Fig. 1. Experimental determination of the unbiased mutation rate of Thermococcus eurythermalis A501 is challenging because this archaeon has unusual physiology (i.e., obligate anaerobic and obligate hyperthermophilic). (A) The preparation of anaerobic high temperature tolerant gelrite plate. After sterilization and polysulfide addition via syringe, the plates are made in an anaerobic chamber. (B) The incubation of the strain T. eurythermalis A501 at 85 °C in liquid medium. (C) The initiation of mutation accumulation (MA) by spreading cells from a single founding colony to 100 lines. Plates are placed in an anaerobic jar for incubation in strictly anaerobic condition at 85 °C. (D) The MA process followed by whole-genome sequencing and data analysis. A single colony of each line is transferred to a new plate N times (here N = 20). (E) Base-substitution mutations and insertion/deletion mutations across the whole genome of T. eurythermalis. The dashed vertical line separates the chromosome and plasmid. The height of each bar represents the number of base-substitution mutations across all MA lines within a 10 kbp window. Green and red triangles denote insertion and deletion, respectively. The locus tags of the 14 genes with statistical enrichment of mutations are shown.

Our MA experiment allowed accumulation of mutations over 314 cell divisions (after correcting for the death rate (Table S1) [17]) in 100 independent lines initiated from a single founder colony and passed through a single cell bottleneck every day. By sequencing genomes of 96 surviving lines at the end of the MA experiment, we identified 544 base-substitution mutations over these lines (Table S2), which translates to an average mutation rate (µ) of 85.01 × 10^-10 per cell division per nucleotide site (see Supplementary information). 
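The arithmetic behind a pooled MA/WGS rate estimate can be sketched as mutations divided by the total number of site-generations observed. The snippet below is a minimal illustration, not the authors' pipeline (their exact procedure is in the Supplementary information). The mutation count, line count and generation count come from the text; the number of analysed sites per genome is not given in this excerpt, so `ASSUMED_SITES` is a hypothetical placeholder of about 2.12 Mb, and the function name is ours.

```python
def base_substitution_rate(mutations, lines, generations, sites):
    """Pooled estimate: mutations per site per cell division across MA lines."""
    return mutations / (lines * generations * sites)

# Counts reported in the text.
MUTATIONS = 544      # base-substitution mutations over all sequenced lines
LINES = 96           # surviving lines that were sequenced
GENERATIONS = 314    # cell divisions per line, after the death-rate correction

# Assumption: analysed sites per genome. This excerpt does not state the
# value; 2.12 Mb is an illustrative placeholder only.
ASSUMED_SITES = 2_120_000

rate = base_substitution_rate(MUTATIONS, LINES, GENERATIONS, ASSUMED_SITES)
print(f"{rate:.2e} per site per cell division")
```

With this placeholder genome size the estimate is of the same order as the reported 85.01 × 10^-10; a different number of callable sites would shift it proportionally.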
The ratio of accumulated nonsynonymous to synonymous mutations (371 vs 107) did not differ from the ratio of nonsynonymous to synonymous sites (1,485,280 vs 403,070) in the A501 genome (χ^2 test; p > 0.05). Likewise, there was no difference of the accumulated mutations between intergenic (65) and protein-coding sites (478) (χ^2 test; p > 0.05). These are evidence for minimal selective elimination of deleterious mutations during the MA process. In general, the mutations were randomly distributed along the chromosome and the plasmid, though 86 base-substitution mutations fell into 14 genes which showed significant enrichment of mutations (bootstrap test; p < 0.05 for each gene) and 52 out of the 86 base-substitution mutations were found in five genes (TEU_RS04685 and TEU_RS08625-08640 gene cluster) (Fig. 1E, Table S3). A majority of mutations in these five genes may have inactivated these genes (38 out of 71 in the former gene and 33 out of 43 in the latter gene cluster) either by nonsense mutation or insertion-deletion (INDEL) mutation. The phenomenon of mutation clustering is not unique to this organism; it was reported in another MA study with the yeast Schizosaccharomyces pombe, and these genomic regions may represent either mutational hotspots or that mutations confer selective advantages under experimental conditions [18]. The TEU_RS04685 encodes the beta subunit of the sodium ion-translocating decarboxylase which is an auxiliary pathway for ATP synthesis by generating sodium motive force via decarboxylation [19], and the TEU_RS08625-08640 encodes a putative nucleoside ABC transporter. These genes appear to be important for energy conservation in the highly fluctuating deep-sea hydrothermal fluids. 
Under the culture conditions in which peptides and amino acids were stably and sufficiently supplied (see the TRM medium recipe in Supplementary information), however, these genes may be dispensable because peptides and amino acids are the preferred carbon and energy sources for T. eurythermalis [15]. On the other hand, some of these genes (e.g., TEU_RS08625) were shown to be upregulated under alkaline stress [16], and thus may be similarly induced under the culture condition in which pH is elevated compared to the vents. Besides, the laboratory condition differed from the vents in a number of other physicochemical features including hydrostatic pressure (0.1 MPa during the MA process versus 20 MPa in situ), temperature and salinity, which likely imposed additional selective pressures on the mutation accumulation processes. Taken together, the inactivation of these genes likely translated into a net fitness gain and was thus driven by selection. Removing these mutations led to a spontaneous mutation rate of 71.57 × 10^-10 per cell division per site for T. eurythermalis A501. After removing the mutations in these 14 genes, both the accumulated mutations at nonsynonymous sites (288) relative to those (104) at synonymous sites (χ^2 test; p = 0.014) and the accumulated mutations at intergenic regions (65) relative to protein-coding regions (392) (χ^2 test; p = 0.013) showed marginally significant differences.

To date, over 20 phylogenetically diverse free-living bacterial species and two archaeal species isolated from various environments have been assayed with MA/WGS, and their mutation rates vary from 0.79 × 10^-10 to 97.80 × 10^-10 per cell division per site [20]. The only prokaryote that displays a mutation rate (97.80 × 10^-10 per cell division per site) comparable to A501 is Mesoplasma florum L1 [21], a host-dependent wall-less bacterium with a highly reduced genome (~700 genes). 
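The nonsynonymous versus synonymous comparisons reported above can be illustrated with a two-category chi-square goodness-of-fit test, in which the expected split of mutations follows the split of sites. This is a sketch under assumptions: the text does not say which chi-square variant the authors ran, the function name is ours, and the site counts are reused unchanged for the filtered data set even though removing the 14 genes would shrink them slightly. For one degree of freedom the p-value equals erfc(sqrt(χ^2 / 2)), so only the standard library is needed.

```python
import math

def chi2_two_categories(obs_a, obs_b, sites_a, sites_b):
    """Goodness-of-fit of two observed counts to the proportions of two site classes.

    Returns (statistic, p_value) for a chi-square test with 1 degree of freedom.
    """
    total = obs_a + obs_b
    exp_a = total * sites_a / (sites_a + sites_b)
    exp_b = total - exp_a
    stat = (obs_a - exp_a) ** 2 / exp_a + (obs_b - exp_b) ** 2 / exp_b
    # Survival function of chi-square with df = 1.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Site counts reported for the A501 genome.
NONSYN_SITES, SYN_SITES = 1_485_280, 403_070

# All mutations: 371 nonsynonymous vs 107 synonymous.
stat_all, p_all = chi2_two_categories(371, 107, NONSYN_SITES, SYN_SITES)

# After removing mutations in the 14 enriched genes: 288 vs 104.
# Assumption: the same site counts are reused for simplicity.
stat_filtered, p_filtered = chi2_two_categories(288, 104, NONSYN_SITES, SYN_SITES)

print(f"all mutations:      chi2 = {stat_all:.2f}, p = {p_all:.3f}")
print(f"enriched genes out: chi2 = {stat_filtered:.2f}, p = {p_filtered:.3f}")
```

Under these assumptions the first comparison is not significant and the second falls below 0.05, which matches the direction of the reported results; the exact p-values need not agree with the published ones.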
Our PCR validation of 20 randomly chosen base-substitution mutations from the two MA lines displaying the highest mutation rates and of all nine INDEL mutations involving >10 bp changes across all lines (Table S2) indicates that the calculated high mutation rate did not result from false bioinformatics predictions.

The extremely high mutation rate of T. eurythermalis is unexpected. One potential explanation in line with the “mutator theory” [22–24] is that a high mutation rate may allow the organisms to gain beneficial mutations more rapidly and thus is selectively favored in deep-sea hydrothermal vents where physicochemical parameters are highly fluctuating. Alternatively, the high mutation rate is the result of random genetic drift according to the “drift-barrier model” [21]. In this model, increased mutation rates are associated with an increased load of deleterious mutations, so natural selection favors lower mutation rates. On the other hand, further improvements of replication fidelity come at an increased cost of investments in DNA repair activities. Therefore, natural selection pushes the replication fidelity to a level that is set by genetic drift, and further improvements are expected to reduce the fitness advantages [11, 21]. These two explanations for the high mutation rate of T. eurythermalis are mutually exclusive, and resolving them requires the calculation of the power of genetic drift, which is inversely proportional to Ne.

A common way to calculate Ne for a prokaryotic population is derived from the equation πS = 2 × Ne × µ, where πS represents the nucleotide diversity at synonymous (silent) sites among randomly sampled members of a panmictic population [25]. We therefore sequenced genomes of another eight T. eurythermalis isolates available in our culture collections. Like T. eurythermalis A501, these additional isolates were collected during the same cruise at Guaymas Basin, at water depths ranging from 1987 m to 2009 m. 
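Rearranging πS = 2 × Ne × µ gives Ne = πS / (2 × µ). The short sketch below plugs in values reported in this article: the median πS of 0.083 and the filtered mutation rate of 71.57 × 10^-10. Pairing these two particular values is an assumption on our part, since the text does not state which µ estimate entered the published calculation, and the function name is ours.

```python
def effective_population_size(pi_s, mu):
    """Solve pi_S = 2 * Ne * mu for Ne (haploid, panmictic population)."""
    return pi_s / (2.0 * mu)

PI_S = 0.083      # median synonymous nucleotide diversity reported in the article
MU = 71.57e-10    # assumption: the rate after excluding the 14 enriched genes

ne = effective_population_size(PI_S, MU)
print(f"Ne = {ne:.2e}")
```

This yields roughly 5.8 × 10^6, close to the reported 5.83 × 10^6; the small gap is plausibly due to rounding of the published πS.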
They differ by only up to 0.135% in the 16S rRNA gene sequence and share a minimum whole-genome average nucleotide identity (ANI) of 95.39% (Table S4), and thus fall within an operationally defined prokaryotic species typically delineated at 95% ANI [26]. Population structure analysis with PopCOGenT [27] showed that these isolates formed a panmictic population and that two of them were repetitive as a result of clonal descent (see Supplementary information). Using the median value of πS = 0.083 across 1628 single-copy orthologous genes shared by the seven non-repetitive genomes, we calculated the Ne of T. eurythermalis to be 5.83 × 10^6.

Next, we collected the unbiased mutation rate of other prokaryotic species determined with the MA/WGS strategy from the literature [11, 28–30]. While the Ne data were also provided by those studies, the isolates used to calculate the Ne were identified based on their membership of either an operationally defined species (e.g., ANI at 95% cutoff) or a phenotypically characterized species (e.g., many pathogens), which often creates a bias in calculating Ne [25]. We therefore again employed PopCOGenT to delineate panmictic populations from those datasets and re-calculated Ne accordingly. There was a significant negative linear relationship between µ and Ne on a logarithmic scale (dashed gray line in Fig. 2A [r^2 = 0.83, slope = −0.85, s.e.m. = 0.09, p < 0.001]) according to a generalized linear model (GLM) regression. This relationship cannot be explained by shared ancestry, as confirmed by phylogenetic generalized least square (PGLS) regression analysis (solid blue line in Fig. 2A [r^2 = 0.81, slope = −0.81, s.e.m. = 0.09, p < 0.001]). The good fit of T. eurythermalis to the regression line validated the drift-barrier hypothesis. This is evidence that the high mutation rate of T. eurythermalis is driven by genetic drift rather than by natural selection.

Fig. 2. The scaling relationship involving the base-substitution mutation rate per cell division per site (µ), the estimated effective population size (Ne), and genome size across 28 bacterial and two archaeal species. All three traits’ values were logarithmically transformed. The mutation rates of these species are all determined with the mutation accumulation experiment followed by whole-genome sequencing of the mutant lines. The mutation rate of species numbered 1–29 (blue) is collected from literature and that of species 30 (red) is determined in the present study. Among the numbered species shown in the figure, species #6, Haloferax volcanii, is a facultative anaerobic halophilic archaeon, and species #30 is an obligate anaerobic hyperthermophilic archaeon. (A) The scaling relationship between µ and Ne. (B) The scaling relationship between µ and genome size. (C) The scaling relationship between genome size and Ne. Numbered data points 21–29 are not shown in A and C because of the lack of population dataset for estimation of Ne. The dashed gray lines and blue lines represent the generalized linear model (GLM) regression and the phylogenetic generalized least square (PGLS) regression, respectively. The Bonferroni adjusted outlier test for the GLM regression shows that #7 Janthinobacterium lividum is an outlier in the scaling relationship between µ and Ne, and #9 Mesoplasma florum is an outlier in the scaling relationship between genome size and Ne. No outlier was identified in the PGLS regression results.

As stated in the drift-barrier theory, a high mutation rate is associated with a high load of deleterious mutations. In the absence of back mutations, recombination becomes an essential mechanism in eliminating deleterious mutations [31]. In support of this argument, the ClonalFrameML analysis [32] shows that members of the T. 
eurythermalis population recombine frequently, with a high ratio of the frequency of recombination to mutation (ρ/θ = 0.59) and a high ratio of the effect of recombination to mutation (r/m = 5.76). In fact, efficient incorporation of DNA from external sources into Thermococcaceae genomes has been well documented experimentally [33, 34]. A second potentially important mechanism facilitating T. eurythermalis adaptation at high temperature is strong purifying selection at the protein sequence level, as protein sequences in thermophiles are generally subject to stronger functional constraints than those in mesophiles [4, 35].

Our finding of an exceptionally high mutation rate in a free-living archaeon is a significant addition to the available collection of MA/WGS data (Table S5), in which a very high mutation rate had previously been known only for a host-dependent bacterium (Mesoplasma florum L1) with unusual biology (e.g., lacking a cell wall). The availability of these two deeply branching organisms (one archaeal, the other bacterial) adopting opposite lifestyles (one free-living versus the other host-restricted; one hyperthermophilic versus the other mesophilic; one obligately anaerobic versus the other facultatively anaerobic), along with other phylogenetically and ecologically diverse prokaryotic organisms displaying low and intermediate mutation rates, provides an opportunity to help illustrate the mechanisms potentially driving genome size evolution across prokaryotes. We found a negative linear relationship (dashed gray line in Fig. 2B [r² = 0.49, slope = −1.66, s.e.m. = 0.32, p < 0.001]) between genome size and base-substitution mutation rate, which is consistent with the hypothesis that increased mutation rate drives microbial genome reduction. We also showed a positive linear relationship (dashed gray line in Fig. 2C [r² = 0.47, slope = 0.24, s.e.m. 
= 0.06, p < 0.001]) between genome size and Ne, which suggests that random genetic drift drives genome reduction across prokaryotes. These correlations remained robust when the data were analyzed as phylogenetically independent contrasts (solid blue lines in Fig. 2B [r² = 0.47, slope = −1.75, s.e.m. = 0.34, p < 0.001] and in Fig. 2C [r² = 0.45, slope = 0.25, s.e.m. = 0.06, p < 0.001]). Our results are consistent with recent studies that employed mathematical modeling and/or comparative sequence analyses to show that random genetic drift [36] and increased mutation rate [37] drive genome reduction across diverse bacterial lineages, including both free-living and host-dependent bacteria. One benefit of the present study is that it directly measures µ and Ne, whereas those recent studies relied on proxies for these metrics (e.g., using the ratio of nonsynonymous to synonymous substitution rates to represent Ne) to infer mechanisms of genome reduction.

Despite this advantage, there are important caveats to our conclusions regarding the mechanisms of genome reduction. The correlation analyses performed here were inspired by the work of Lynch and colleagues, who had great success in explaining eukaryotic genome expansion with genetic drift [11, 38]. However, a few key differences in genomic features between prokaryotes and eukaryotes make it more difficult to explain the correlation observed in prokaryotes. Importantly, genome sizes of eukaryotes can vary over several orders of magnitude, whereas those of free-living prokaryotes differ by only an order of magnitude [11], so there is much less variability to explain in prokaryotes. Moreover, eukaryotic genomes experience dramatic expansions of transposable elements, which are often considered genomic parasites, whereas prokaryotic genomes, including large ones, are usually depleted of transposable elements, and their genome size variation is largely driven by gene content [39]. 
Aside from these conceptual difficulties, the plots (Fig. 2B, C) are poorly populated with typical free-living species carrying small genomes, such as Prochlorococcus (mostly 1.6–8 Mb) and Pelagibacterales (1.3–1.5 Mb), which dominate the photosynthetic and heterotrophic microbial communities, respectively, in the ocean [40]. It has been generally postulated that bacterial species in these lineages have very large Ne [39–41], though there has been little direct evidence for it [42, 43]. If confirmed through the measurement of the unbiased mutation rate (µ) followed by the calculation of Ne based on µ, this might compromise the linear relationship between genome size and Ne observed here (Fig. 2C). It is also not necessarily appropriate to translate correlations into causal relationships. For example, the correlation between increased mutation rates and decreased genome sizes (Fig. 2B) does not necessarily mean that increased mutation rate drives genome reduction, because high mutation rates are observed in species with small Ne. Given that deletion bias is commonly found in prokaryotes [44, 45], genome reduction can be easily explained by increased fixation of deletional mutations in species with smaller Ne. High mutation rates in these species are simply the result of random genetic drift, as explained by the drift-barrier theory, and they may have a limited role in driving genome reduction.

Whereas our analysis based on the available data did not support natural selection as a universal mechanism driving genome reduction across prokaryotes (Fig. 2B, C), this does not mean that selection has no role in the genome reduction of a particular taxon. In the case of thermophiles, proponents of selection acting to reduce genomes have argued that genome size, owing to its positive correlation with cell volume, may be an indirect target of selection that strongly favors smaller cell volume [35]. 
The underlying principle is that high temperature requires cells to increase the lipid content and change the lipid composition of their cell membranes, which consumes a large part of the cellular energy, and thus lower cell volume is selectively favored at high temperature [35]. Our calculation of a relatively small Ne in T. eurythermalis does not necessarily contradict this selective argument, provided that the fitness gained by decreasing cell volume and thus reducing genome size is large enough to overcome the power of random genetic drift. On the other hand, our data strongly indicate that neutral forces dictate the genome evolution of T. eurythermalis, and they are not negligible with regard to its genome reduction process. The significantly greater number of deletion than insertion events (t test; 95 versus 37 events with p < 0.001 and 48 versus 20 events with p < 0.05 before and after removing the 14 genes enriched in mutations, respectively) and the significantly greater number of nucleotides involved in deletions than in insertions (t test; 433 versus 138 bp with p < 0.05 and 386 versus 121 bp with p < 0.001 before and after removing the 14 genes enriched in mutations, respectively) suggest that deletion bias, combined with increased chance fixation of deletion mutants due to low Ne, is a potentially important neutral mechanism giving rise to the small genome of T. eurythermalis (2.12 Mbp).

The globally distributed deep-sea hydrothermal vents are microbe-driven ecosystems, with no known macroorganisms surviving in the vent fluids. Sample collection, microbial isolation, and laboratory propagation of mutation lines under hot and anoxic conditions are challenging. In the present study, we determined that T. eurythermalis, and perhaps Thermococcaceae in general, has a highly increased mutation rate and a highly decreased effective population size compared to all other known free-living prokaryotic lineages. 
While it remains to be tested whether this is a common feature among vent microbes, the present study nevertheless opens a new avenue for investigating hyperthermophile ecology and evolution in the deep sea.
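
As a worked illustration of the relation πS = 2 × Ne × µ quoted in the text, the following minimal Python sketch rearranges it to Ne = πS / (2µ). The function name `effective_population_size` is hypothetical, and the mutation rate in the usage lines is an arbitrary placeholder, not the rate measured in the study; only πS = 0.083 is taken from the text.

```python
def effective_population_size(pi_s: float, mu: float) -> float:
    """Rearrange pi_S = 2 * Ne * mu to Ne = pi_S / (2 * mu).

    pi_s: nucleotide diversity at synonymous sites (dimensionless).
    mu:   base-substitution mutation rate per site per cell division.
    """
    if pi_s < 0 or mu <= 0:
        raise ValueError("pi_s must be non-negative and mu must be positive")
    return pi_s / (2.0 * mu)


if __name__ == "__main__":
    pi_s = 0.083        # median synonymous diversity reported in the text
    mu_example = 1e-9   # placeholder value for illustration only (assumption)
    ne = effective_population_size(pi_s, mu_example)
    print(f"Ne = {ne:.3e}")
```

Because Ne scales as 1/µ for a fixed πS, any uncertainty in the measured mutation rate carries over proportionally to the Ne estimate.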
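
The scaling relationships in Fig. 2 are linear fits on log-transformed values. The sketch below shows one way such a fit could be computed, assuming an ordinary least-squares fit of log10(µ) on log10(Ne) (equivalent to a Gaussian GLM with identity link). It does not implement the PGLS correction, and the data points are synthetic values generated from a power law with exponent −0.85 purely for illustration; they are not the study's measurements. The function name `log_log_fit` is hypothetical.

```python
import math


def log_log_fit(x, y):
    """Least-squares fit of log10(y) on log10(x).

    Returns (slope, intercept, r_squared) on the log10-log10 scale.
    """
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need at least three paired observations")
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Synthetic, noise-free illustration data (not from the study).
    ne_values = [1e6, 1e7, 1e8, 1e9]
    mu_values = [3.0 * ne ** -0.85 for ne in ne_values]
    slope, intercept, r2 = log_log_fit(ne_values, mu_values)
    print(f"slope = {slope:.2f}, r^2 = {r2:.2f}")
```

A negative slope on this scale means that µ falls as a power of Ne, which is the pattern the drift-barrier model predicts.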
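
The deletion bias reported above (95 deletions versus 37 insertions, and 48 versus 20 after removing the 14 mutation-enriched genes) can be checked with a simple count-based test. The text reports t tests; the sketch below swaps in an exact two-sided binomial (sign) test against a 50:50 expectation, because only the event counts are available here. It is an illustration under that assumption, not the authors' procedure, and the function name `two_sided_binomial_p` is hypothetical.

```python
from math import comb


def two_sided_binomial_p(k: int, n: int) -> float:
    """Exact two-sided p-value for k successes in n trials under p = 0.5.

    With p = 0.5 the distribution is symmetric, so the two-sided p-value
    is twice the probability of the smaller tail, capped at 1.
    """
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    tail = min(k, n - k)
    tail_count = sum(comb(n, i) for i in range(tail + 1))
    return min(1.0, 2 * tail_count / 2 ** n)


if __name__ == "__main__":
    # Event counts taken from the text: (deletions, insertions).
    for deletions, insertions in [(95, 37), (48, 20)]:
        n_events = deletions + insertions
        p = two_sided_binomial_p(deletions, n_events)
        print(f"{deletions} vs {insertions}: p = {p:.2e}")
```

Both comparisons reject an equal deletion and insertion rate at the 5% level under this test, in line with the direction of the bias described in the text.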

