Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Didelot X  Lawson D  Darling A  Falush D 《Genetics》2010,186(4):1435-1449
Bacteria and archaea reproduce clonally, but sporadically import DNA into their chromosomes from other organisms. In many of these events, the imported DNA replaces a homologous segment in the recipient genome. Here we present a new method to reconstruct the history of recombination events that affected a given sample of bacterial genomes. We introduce a mathematical model that represents both the donor and the recipient of each DNA import as an ancestor of the genomes in the sample. The model represents a simplification of the previously described coalescent with gene conversion. We implement a Markov chain Monte Carlo algorithm to perform inference under this model from sequence data alignments and show that inference is feasible for whole-genome alignments through parallelization. Using simulated data, we demonstrate accurate and reliable identification of individual recombination events and global recombination rate parameters. We applied our approach to an alignment of 13 whole genomes from the Bacillus cereus group. We find, as expected from laboratory experiments, that the recombination rate is higher between closely related organisms and also that the genome contains several broad regions of elevated levels of recombination. Application of the method to the genomic data sets that are becoming available should reveal the evolutionary history and private lives of populations of bacteria and archaea. The methods described in this article have been implemented in a computer software package, ClonalOrigin, which is freely available from http://code.google.com/p/clonalorigin/.

2.
Advances in high-throughput DNA sequencing technologies have led to an explosion in the number of sequenced bacterial genomes. Comparative sequence analysis frequently reveals evidence of homologous recombination occurring through different mechanisms and at different rates in different species, but the large-scale use of computational methods to identify recombination events is hampered by their high computational costs. Here, we propose a new method to identify recombination events in large datasets of whole genome sequences. Using a filtering procedure of the gene conservation profiles of a test genome against a panel of strains, this algorithm identifies sets of contiguous genes acquired by homologous recombination. The locations of the recombination breakpoints are determined using a statistical test that is able to account for the differences in the natural rate of evolution between different genes. The algorithm was tested on a dataset of 75 genomes of Staphylococcus aureus and 50 genomes comprising different streptococcal species, and was able to detect intra-species recombination events in S. aureus and in Streptococcus pneumoniae. Furthermore, we found evidence of an inter-species exchange of genetic material between S. pneumoniae and Streptococcus mitis, a closely related commensal species that colonizes the same ecological niche. The method has been implemented in an R package, Reco, which is freely available from supplementary material, and provides a rapid screening tool to investigate recombination on a genome-wide scale from sequence data.

3.
Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are not cost-efficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology are lacking. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of the genome termed Selected Target Regions (SeTRs). In our experiments, 99.73% to 99.95% of the SeTRs are covered with sufficient depth. Our bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible LOH events and 3 UPD events larger than 5 Mb from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information.

4.
The use of whole-genome phylogenetic analysis has revolutionized our understanding of the evolution and spread of many important bacterial pathogens due to the high-resolution view it provides. However, the majority of such analyses do not consider the potential role of accessory genes when inferring evolutionary trajectories. Moreover, the recently discovered importance of the switching of gene regulatory elements suggests that an exhaustive analysis, combining information from core and accessory genes with regulatory elements, could provide unparalleled detail of the evolution of a bacterial population. Here we demonstrate this principle by applying it to a worldwide multi-host sample of the important pathogenic E. coli lineage ST131. Our approach reveals the existence of multiple circulating subtypes of the major drug–resistant clade of ST131 and provides the first ever population level evidence of core genome substitutions in gene regulatory regions associated with the acquisition and maintenance of different accessory genome elements.

5.
Salmonella enterica is a bacterial pathogen that causes enteric fever and gastroenteritis in humans and animals. Although its population structure was long described as clonal, based on high linkage disequilibrium between loci typed by enzyme electrophoresis, recent examination of gene sequences has revealed that recombination plays an important evolutionary role. We sequenced around 10% of the core genome of 114 isolates of enterica using a resequencing microarray. Application of two different analysis methods (Structure and ClonalFrame) to our genomic data allowed us to define five clear lineages within S. enterica subspecies enterica, one of which is five times older than the other four and two thirds of the age of the whole subspecies. We show that some of these lineages display more evidence of recombination than others. We also demonstrate that some level of sexual isolation exists between the lineages, so that recombination has occurred predominantly between members of the same lineage. This pattern of recombination is compatible with expectations from the previously described ecological structuring of the enterica population as well as mechanistic barriers to recombination observed in laboratory experiments. In spite of their relatively low level of genetic differentiation, these lineages might therefore represent incipient species.

6.
Nam K  Ellegren H 《PLoS genetics》2012,8(5):e1002680
Selective and/or neutral processes may govern variation in DNA content and, ultimately, genome size. The observation in several organisms of a negative correlation between recombination rate and intron size could be compatible with a neutral model in which recombination is mutagenic for length changes. We used whole-genome data on small insertions and deletions within transposable elements from chicken and zebra finch to demonstrate clear links between recombination rate and a number of attributes of reduced DNA content. Recombination rate was negatively correlated with the length of introns, transposable elements, and intergenic spacer and with the rate of short insertions. Importantly, it was positively correlated with gene density, the rate of short deletions, the deletion bias, and the net change in sequence length. All these observations point at a pattern of more condensed genome structure in regions of high recombination. Based on the observed rates of small insertions and deletions and assuming that these rates are representative for the whole genome, we estimate that the genome of the most recent common ancestor of birds and lizards has lost nearly 20% of its DNA content up until the present. Expansion of transposable elements can counteract the effect of deletions in an equilibrium mutation model; however, since the activity of transposable elements has been low in the avian lineage, the deletion bias is likely to have had a significant effect on genome size evolution in dinosaurs and birds, contributing to the maintenance of a small genome. We also demonstrate that most of the observed correlations between recombination rate and genome contraction parameters are seen in the human genome, including for segregating indel polymorphisms. Our data are compatible with a neutral model in which recombination drives vertebrate genome size evolution and give no direct support for a role of natural selection in this process.

7.
When speciation events occur in rapid succession, incomplete lineage sorting (ILS) can cause disagreement among individual gene trees. The probability that ILS affects a given locus is directly related to its effective population size (Ne), which in turn is proportional to the recombination rate if there is strong selection across the genome. Based on these expectations, we hypothesized that low‐recombination regions of the genome, as well as sex chromosomes and nonrecombining chromosomes, should exhibit lower levels of ILS. We tested this hypothesis in phylogenomic datasets from primates, the Drosophila melanogaster clade, and the Drosophila simulans clade. In all three cases, regions of the genome with low or no recombination showed significantly stronger support for the putative species tree, although results from the X chromosome differed among clades. Our results suggest that recurrent selection is acting in these low‐recombination regions, such that current levels of diversity also reflect past decreases in the effective population size at these same loci. The results also demonstrate how considering the genomic context of a gene tree can assist in more accurate determination of the true species phylogeny, especially in cases where a whole‐genome phylogeny appears to be an unresolvable polytomy.

8.
Nakajima T 《Bio Systems》2012,108(1-3):34-44
Epistatic interactions between genes in the genome constrain the accessible evolutionary paths of lineages. Two factors involving epistasis that can affect the evolutionary path and fate of lineages were investigated. The first factor concerns the impact of competition with another species lineage that has different epistatic constraints. Five enteric bacterial populations were evolved by point mutation in medium containing a single limiting resource. Single-species and two-species cultures were used to determine whether different asexual lineages have different capacities for producing variants due to epistatic constraints, and whether their survival is determined by local inter-lineage competition with different species. Local inter-lineage competition quickly resulted in one successful lineage, with another lineage becoming extinct before finding a higher peak. The second factor concerns a peak-shifting process: whether sexual recombination between different demes can cause peak shifts was investigated. An Escherichia coli population consisting of a male (Hfr) and a female (F(-)) strain was evolved in a single limiting resource and compared to evolving populations containing the male or female strain alone. The E. coli sexual lineage was successful due to its ability to escape lower peaks and reach a higher peak, not because of a rapid approach to the nearest local peak the male or female asexual lineage could reach. The data in this study demonstrate that lineage survivability can be determined by the ability to produce beneficial mutations and checked by local competition between lineages of different species. Interspecific competition may prevent a population from evolving through crossing fitness valleys or adaptive ridges if it requires many generations to achieve peak shifts. The data also show that genomic recombination between different conspecific lineages can rapidly carry the combined lineage to a higher peak.

9.
An integrated physical and genetic map of the rice genome   Total citations: 12, self-citations: 0, citations by others: 12
Rice was chosen as a model organism for genome sequencing because of its economic importance, small genome size, and syntenic relationship with other cereal species. We have constructed a bacterial artificial chromosome fingerprint–based physical map of the rice genome to facilitate the whole-genome sequencing of rice. Most of the rice genome (~90.6%) was anchored genetically by overgo hybridization, DNA gel blot hybridization, and in silico anchoring. Genome sequencing data also were integrated into the rice physical map. Comparison of the genetic and physical maps reveals that recombination is suppressed severely in centromeric regions as well as on the short arms of chromosomes 4 and 10. This integrated high-resolution physical map of the rice genome will greatly facilitate whole-genome sequencing by helping to identify a minimum tiling path of clones to sequence. Furthermore, the physical map will aid map-based cloning of agronomically important genes and will provide an important tool for the comparative analysis of grass genomes.

10.
Genetic exchange between isolated populations, or introgression between species, serves as a key source of novel genetic material on which natural selection can act. While detecting historical gene flow from DNA sequence data is of much interest, many existing methods can be limited by requirements for deep population genomic sampling. In this paper, we develop a scalable genealogy-based method to detect candidate signatures of gene flow into a given population when the source of the alleles is unknown. Our method does not require sequenced samples from the source population, provided that the alleles have not reached fixation in the sampled recipient population. The method utilizes recent advances in algorithms for the efficient reconstruction of ancestral recombination graphs, which encode genealogical histories of DNA sequence data at each site, and is capable of detecting the signatures of gene flow whose footprints are of length up to single genes. Further, we employ a theoretical framework based on coalescent theory to test for statistical significance of certain recombination patterns consistent with gene flow from divergent sources. Implementing these methods for application to whole-genome sequences of environmental yeast isolates, we illustrate the power of our approach to highlight loci with unusual recombination histories. By developing innovative theory and methods to analyze signatures of gene flow from population sequence data, our work establishes a foundation for the continued study of introgression and its evolutionary relevance.

11.
Intrachromosomal homologous recombination in whole plants.   Total citations: 22, self-citations: 2, citations by others: 20
P Swoboda  S Gal  B Hohn    H Puchta 《The EMBO journal》1994,13(2):484-489
A system to assay intrachromosomal homologous recombination during the complete life-cycle of a whole higher eukaryote was set up. Arabidopsis thaliana plants were transformed with a recombination substrate carrying a non-selectable and quantitatively detectable marker gene. The recombination substrates contain two overlapping, non-functional deletion mutants of a chimeric beta-glucuronidase (uidA) gene. Upon recombination, as proven by Southern blot analysis, a functional gene is restored and its product can be detected by histochemical staining. Therefore, cells in which recombination events occurred, and their progeny, can be precisely localized in the whole plant. Recombination was observed in all plant organs examined, from the seed stage until the flowering stage of somatic plant development. Meristematic recombination events revealed cell lineage patterns. Overall recombination frequencies typically were in the range of 10^-6 to 10^-7 events/genome. Recombination frequencies were found to differ in different organs of particular transgenic lines.

12.
Recombination is an engine of genetic diversity and therefore constitutes a key process in evolutionary biology and genetics. While the outcome of crossover recombination can readily be detected as shuffled alleles by following the inheritance of markers in pedigreed families, the more precise location of both crossover and non-crossover recombination events has been difficult to pinpoint. As a consequence, we lack a detailed portrait of the recombination landscape for most organisms and knowledge of how this landscape impacts on sequence evolution at a local scale. To localize recombination events with high resolution in an avian system, we performed whole-genome re-sequencing at high coverage of a complete three-generation collared flycatcher pedigree. We identified 325 crossovers at a median resolution of 1.4 kb, with 86% of the events localized to <10 kb intervals. Observed crossover rates were in excellent agreement with data from linkage mapping, were 52% higher in male (3.56 cM/Mb) than in female meiosis (2.28 cM/Mb), and increased towards chromosome ends in male but not female meiosis. Crossover events were non-randomly distributed in the genome with several distinct hot-spots and a concentration in genic regions, with the highest density in promoters and CpG islands. We further identified 267 non-crossovers, whose location was significantly associated with crossover locations. We detected a significant transmission bias (0.18) in favour of ‘strong’ (G, C) over ‘weak’ (A, T) alleles at non-crossover events, providing direct evidence for the process of GC-biased gene conversion in an avian system. The approach taken in this study should be applicable to any species and would thereby help to provide a more comprehensive portrait of the recombination landscape across organism groups.

13.
14.
Mycobacterium tuberculosis, the causative agent of most human tuberculosis, infects one third of the world's population and kills an estimated 1.7 million people a year. With the world-wide emergence of drug resistance, and the finding of more functional genetic diversity than previously expected, there is a renewed interest in understanding the forces driving genome evolution of this important pathogen. Genetic diversity in M. tuberculosis is dominated by single nucleotide polymorphisms and small scale gene deletion, with little or no evidence for large scale genome rearrangements seen in other bacteria. Recently, a single report described a large scale genome duplication that was suggested to be specific to the Beijing lineage. We report here multiple independent large-scale duplications of the same genomic region of M. tuberculosis detected through whole-genome sequencing. The duplications occur in strains belonging to both M. tuberculosis lineage 2 and 4, and are thus not limited to Beijing strains. The duplications occur in both drug-resistant and drug-susceptible strains. The duplicated regions also have substantially different boundaries in different strains, indicating different originating duplication events. We further identify a smaller segmental duplication of a different genomic region of a lab strain of H37Rv. The presence of multiple independent duplications of the same genomic region suggests either instability in this region, a selective advantage conferred by the duplication, or both. The identified duplications suggest that large-scale gene duplication may be more common in M. tuberculosis than previously considered.

15.
Francisella noatunensis subsp. orientalis (FNO) is an important emerging pathogen associated with disease outbreaks in farm-raised Nile tilapia. When FNO genetic diversity was assessed using PCR-based typing, no intra-species discrimination was achieved among isolates/strains from different countries, indicating a clonal behaviour pattern. In this study, we aimed to evaluate the population structure of FNO isolates by comparing whole-genome sequencing data. The analysis of recombination showed that the group of Brazilian isolates formed a clonal population, whereas other lineages of isolates from foreign countries were also supported by this analysis. The whole-genome multilocus sequence typing (wgMLST) analysis showed varying numbers of dissimilar alleles, suggesting that the Brazilian clonal population is expanding. Each Brazilian isolate could be identified as a single node by a high-resolution gene-by-gene approach, presenting slight genetic differences associated with mutational events. The common ancestry node suggests a single entry into the country before 2012, and the rapid dissemination of this infectious agent may be linked to market sales of infected fingerlings.

16.
Asexual bacterial populations inevitably consist of an assemblage of distinct clonal lineages. However, bacterial populations are not entirely asexual since recombinational exchanges occur, mobilizing small genome segments among lineages and species. The relative contribution of recombination, as opposed to de novo mutation, in the generation of new bacterial genotypes varies among bacterial populations and, as this contribution increases, the clonality of a given population decreases. In consequence, a spectrum of possible population structures exists, with few bacterial species occupying the extremes of highly clonal and completely non-clonal, most containing both clonal and non-clonal elements. The analysis of collections of bacterial isolates, which accurately represent the natural population, by nucleotide sequence determination of multiple housekeeping loci provides data that can be used both to investigate the population structure of bacterial pathogens and for the molecular characterization of bacterial isolates. Understanding the population structure of a given pathogen is important since it impacts on the questions that can be addressed by, and the methods and samples required for, effective molecular epidemiological studies.

17.
Crassostrea gigas is a model mollusk, but its genetic features have not been studied comprehensively. In this study, we used whole-genome resequencing data to identify and characterize nucleotide diversity and population recombination rate in a diverse collection of 21 C. gigas samples. Our analyses revealed that C. gigas harbors both extremely high genetic diversity and recombination rates across the whole genome compared with those of other taxa. The noncoding regions, introns, intergenic spacers, and untranslated regions (UTRs) showed a lower level of diversity than the synonymous sites. The larger introns tended to have lower diversity. Moreover, we found a negative association of the non-synonymous diversity with gene expression, which suggested that purifying selection played an important role in shaping genetic diversity. The nucleotide diversity at the 100- and 50-kb levels was positively correlated with population recombination rates, which was expected if the diversity was shaped by purifying selection or hitchhiking of advantageous mutants. Our work gives a general picture of the oyster’s polymorphism pattern and its association with recombination rates.
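As an illustration of the nucleotide diversity statistic mentioned in this abstract, the following is a minimal sketch that computes per-site diversity as the average number of pairwise differences between aligned sequences, divided by the alignment length. This is the textbook definition, not necessarily the estimator used by the authors; the function name `nucleotide_diversity` and the toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site for equal-length aligned sequences.

    Hypothetical helper for illustration; assumes no gaps or missing data.
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")

    # Count mismatching positions for every unordered pair of sequences.
    total_differences = sum(
        sum(a != b for a, b in zip(s, t))
        for s, t in combinations(sequences, 2)
    )
    n_pairs = len(sequences) * (len(sequences) - 1) // 2
    return total_differences / (n_pairs * length)


if __name__ == "__main__":
    toy_alignment = ["AAAA", "AAAT", "AATT"]  # made-up data
    print(nucleotide_diversity(toy_alignment))
```

Windowed estimates such as the 100-kb and 50-kb values in the abstract would, under this sketch, come from applying the same function to slices of the alignment.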

18.
Some statistical properties of samples of DNA sequences are studied under an infinite-site neutral model with recombination. The two quantities of interest are R, the number of recombination events in the history of a sample of sequences, and RM, the number of recombination events that can be parsimoniously inferred from a sample of sequences. Formulas are derived for the mean and variance of R. In contrast to R, RM can be determined from the sample. Since no formulas are known for the mean and variance of RM, they are estimated with Monte Carlo simulations. It is found that RM is often much less than R; therefore, the number of recombination events may be greatly under-estimated in a parsimonious reconstruction of the history of a sample. The statistic RM can be used to estimate the product of the recombination rate and the population size or, if the recombination rate is known, to estimate the population size. To illustrate this, DNA sequences from the Adh region of Drosophila melanogaster are used to estimate the effective population size of this species.
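To make concrete how RM can be determined from the sample, the following is a minimal sketch of the commonly used four-gamete approach. Under the infinite-site model, two biallelic sites that show all four gametes must be separated by at least one recombination event. The sketch collects every such pair of sites as an interval and counts the largest set of non-overlapping intervals. The function names and the '0'/'1' haplotype encoding are assumptions made for illustration, not taken from the abstract.

```python
from itertools import combinations


def shows_four_gametes(site_a, site_b):
    """True if all four allele combinations occur across two biallelic sites."""
    return len(set(zip(site_a, site_b))) == 4


def minimum_recombination_events(haplotypes):
    """Lower bound RM on recombination events from '0'/'1' haplotype strings.

    Hypothetical helper: assumes equal-length strings, one per sampled sequence.
    """
    if not haplotypes:
        return 0
    sites = list(zip(*haplotypes))  # transpose to one tuple per site

    # Every incompatible pair (i, j) implies a recombination between i and j.
    intervals = [
        (i, j)
        for i, j in combinations(range(len(sites)), 2)
        if shows_four_gametes(sites[i], sites[j])
    ]

    # Greedy scan by right endpoint selects a maximum set of disjoint intervals.
    intervals.sort(key=lambda interval: interval[1])
    count = 0
    last_right = 0
    for left, right in intervals:
        if left >= last_right:
            count += 1
            last_right = right
    return count


if __name__ == "__main__":
    sample = ["000", "011", "101", "110"]  # made-up haplotypes
    print(minimum_recombination_events(sample))
```

Because only incompatible site pairs leave a trace, a count obtained this way is a lower bound, in line with the abstract's observation that RM is often much smaller than R.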

19.
Due to genetic variation in the ancestor of two populations or two species, the divergence time for DNA sequences from two populations is variable along the genome. Within genomic segments, all bases share the same divergence time, because they share a most recent common ancestor, when no recombination event has occurred to split them apart. The size of these segments of constant divergence depends on the recombination rate, but also on the speciation time, the effective population size of the ancestral population, as well as demographic effects and selection. Thus, inference of these parameters may be possible if we can decode the divergence times along a genomic alignment. Here, we present a new hidden Markov model that infers the changing divergence (coalescence) times along the genome alignment using a coalescent framework, in order to estimate the speciation time, the recombination rate, and the ancestral effective population size. The model is efficient enough to allow inference on whole-genome data sets. We first investigate the power and consistency of the model with coalescent simulations and then apply it to the whole-genome sequences of the two orangutan sub-species, Bornean (P. p. pygmaeus) and Sumatran (P. p. abelii) orangutans from the Orangutan Genome Project. We estimate the speciation time between the two sub-species to be thousand years ago and the effective population size of the ancestral orangutan species to be , consistent with recent results based on smaller data sets. We also report a negative correlation between chromosome size and ancestral effective population size, which we interpret as a signature of recombination increasing the efficacy of selection.

20.
The advent of next‐generation sequencing (NGS) has dramatically changed bacterial typing technologies, increasing our ability to differentiate bacterial isolates. Although it is now possible to sequence a bacterial genome in a few days and at reasonable cost, most genetic analyses do not require whole‐genome sequencing, which also remains impractical for large population samples due to the cost of individual library preparation and bioinformatics. More traditional sequencing approaches, however, such as MultiLocus Sequence Typing (MLST), are quite laborious and time‐consuming, especially for large‐scale analyses. In this study, a genotyping approach based on restriction site‐associated (RAD) tag sequencing, 2b‐RAD, was applied to characterize Listeria monocytogenes strains. To verify the feasibility of the method, an in silico analysis was performed on 30 available complete genomes. For the same set of strains, in silico MLST analysis was conducted as well. Subsequently, 2b‐RAD and MLST analyses were experimentally carried out on 58 isolates collected from food samples or food‐processing sites. The obtained results demonstrate that 2b‐RAD predicts MLST types and often provides more detailed information on population structure than MLST. Moreover, the majority of variants differentiating identical sequence type isolates mapped against accessory fragments, thus providing additional information to characterize strains. Although MLST still represents a reliable typing method, large‐scale studies on molecular epidemiology and public health, as well as bacterial phylogenetics, population genetics and biosafety, could benefit from a low‐cost and fast turnaround approach such as the 2b‐RAD analysis proposed here.
