Similar literature
1.
Whole-genome resequencing (WGR) is a high-throughput way to determine genomic variations in breeding-related research. Accuracy and sensitivity are two of the most important issues in variation calling of WGR, especially for samples with low-depth resequencing data, which are used to reduce cost and save time in studies such as surveys of core germplasms from natural populations or genome-based breeding selection in segregating populations. An approach called pooled mapping was developed to call variations from low-depth resequencing data of natural or segregating populations. It is highly accurate and sensitive. First, pooled mapping creates a library of confident polymorphic loci in genomes of the population; then, the genotypes are called at these confident loci for each sample in an efficient manner. The reliability of this pooled mapping method was confirmed using simulated datasets, real resequencing data and experimental genotyping. With onefold simulated resequencing data, pooled mapping identified SNPs with high accuracy (99.59 %) and sensitivity (93 %), compared to the commonly used method (accuracy: 29 %; sensitivity: 56 %). For the real low-depth resequencing data (≈0.8×) of 281 B. oleracea accessions, four loci corresponding to 1063 sites were selected for KASP genotyping to confirm the performance of pooled mapping. For all 875 homozygous sites analyzed, pooled mapping achieved an accuracy of 98.24 % and a sensitivity of 90.97 %. In conclusion, pooled mapping is an efficient means of determining reliable genomic variations with limited resequencing data for population samples. It will be a valuable tool in population genomic analysis and genome-based breeding research.
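The abstract reports accuracy and sensitivity without defining them. As a minimal sketch, assuming accuracy means the share of called genotypes that match a validation genotype and sensitivity means the share of validated sites that were called correctly, the two figures could be computed as below. The function name and the dictionary layout are hypothetical and are not taken from the pooled mapping software.

```python
def accuracy_and_sensitivity(called, truth):
    """Compare called genotypes against validation genotypes.

    called: dict mapping site -> genotype, only for sites that were called
    truth:  dict mapping site -> validated genotype (e.g. from KASP assays)

    Assumed definitions (not stated in the abstract):
      accuracy    = correct calls / all calls made
      sensitivity = correct calls / all validated sites
    """
    correct = sum(1 for site, gt in called.items() if truth.get(site) == gt)
    accuracy = correct / len(called) if called else 0.0
    sensitivity = correct / len(truth) if truth else 0.0
    return accuracy, sensitivity


if __name__ == "__main__":
    truth = {1: "AA", 2: "AG", 3: "GG", 4: "AA"}
    called = {1: "AA", 2: "AG", 3: "AA"}  # site 4 was not called
    acc, sens = accuracy_and_sensitivity(called, truth)
    print(f"accuracy={acc:.3f} sensitivity={sens:.3f}")
```

Under these definitions a caller can be accurate but insensitive (few calls, all correct), which is why the abstract quotes both numbers.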

2.
3.
Detecting positive selection using genomic data is critical to understanding the role of adaptive evolution. Of particular interest in this context are sex chromosomes, since they are thought to play a special role in local adaptation and speciation. We sought to circumvent the challenges associated with statistical phasing when using haplotype‐based statistics in sweep scans by exploiting the fact that whole chromosome haplotypes of the sex chromosomes can be obtained by resequencing individuals of the hemizygous sex. We analyzed whole Z chromosome haplotypes from 100 females from several populations of four black and white flycatcher species (in birds, females are ZW and males ZZ). Based on integrated haplotype score (iHS) and number of segregating sites by length (nSL) statistics, we found strong and frequent haplotype structure in several regions of the Z chromosome in each species. Most of these sweep signals were population‐specific, with essentially no evidence for regions under selection shared among species. Some completed sweeps were revealed by the cross‐population extended haplotype homozygosity (XP‐EHH) statistic. Importantly, by using statistically phased Z chromosome data from resequencing of males, we failed to recover the signals of selection detected in analyses based on whole chromosome haplotypes from females; instead, what likely represent false signals of selection were frequently seen. This highlights the power issues in statistical phasing and cautions against conclusions from selection scans using such data. The detection of frequent selective sweeps on the avian Z chromosome supports a large role of sex chromosomes in adaptive evolution.

4.
Whole genome sequences (WGS) greatly increase our ability to precisely infer population genetic parameters, demographic processes, and selection signatures. However, WGS may still not be affordable for a representative number of individuals/populations. In this context, our goal was to assess the efficiency of several SNP genotyping strategies by testing their ability to accurately estimate parameters describing neutral diversity and to detect signatures of selection. We analysed 110 WGS at 12× coverage for four different species, i.e., sheep, goats and their wild counterparts. From these data we generated 946 data sets corresponding to random panels of 1K to 5M variants, commercial SNP chips and exome capture, for sample sizes of five to 48 individuals. We also extracted low‐coverage genome resequencing of 1×, 2× and 5× by randomly subsampling reads from the 12× resequencing data. Globally, 5K to 10K random variants were enough for an accurate estimation of genome diversity. Conversely, commercial panels and exome capture displayed strong ascertainment biases. Besides the characterization of neutral diversity, the detection of the signature of selection and the accurate estimation of linkage disequilibrium (LD) required high‐density panels of at least 1M variants. Finally, genotype likelihoods increased the quality of variant calling from low coverage resequencing but proportions of incorrect genotypes remained substantial, especially for heterozygote sites. Whole genome resequencing coverage of at least 5× appeared to be necessary for accurate assessment of genomic variations. These results have implications for studies seeking to deploy low‐density SNP collections or genome scans across genetically diverse populations/species showing similar genetic characteristics and patterns of LD decay for a wide variety of purposes.
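The low-coverage data sets described above come from randomly subsampling reads of the 12× data. A minimal sketch of that idea, assuming each read is kept independently with probability target depth divided by source depth; the function names are hypothetical and the abstract does not say which tool the authors used.

```python
import random


def keep_fraction(source_depth, target_depth):
    """Fraction of reads to retain to go from source_depth to target_depth."""
    if not 0 < target_depth <= source_depth:
        raise ValueError("target depth must be positive and no larger than source depth")
    return target_depth / source_depth


def subsample_reads(reads, source_depth, target_depth, seed=0):
    """Keep each read independently with probability target/source.

    The resulting depth is only approximately target_depth, because the
    number of retained reads is itself random.
    """
    fraction = keep_fraction(source_depth, target_depth)
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return [read for read in reads if rng.random() < fraction]


if __name__ == "__main__":
    reads = [f"read_{i}" for i in range(12000)]
    for depth in (1, 2, 5):
        kept = subsample_reads(reads, source_depth=12, target_depth=depth)
        print(f"{depth}x: kept {len(kept)} of {len(reads)} reads")
```

For paired-end data both mates of a pair would need to be kept or dropped together; this sketch treats reads as independent for simplicity.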

5.
MOTIVATION: Next-generation targeted resequencing of genome-wide association study (GWAS)-associated genomic regions is a common approach for follow-up of indirect association of common alleles. However, it is prohibitively expensive to sequence all the samples from a well-powered GWAS study with sufficient depth of coverage to accurately call rare genotypes. As a result, many studies may use next-generation sequencing for single nucleotide polymorphism (SNP) discovery in a smaller number of samples, with the intent to genotype candidate SNPs with rare alleles captured by resequencing. This approach is reasonable, but may be inefficient for rare alleles if samples are not carefully selected for the resequencing experiment. RESULTS: We have developed a probability-based approach, SampleSeq, to select samples for a targeted resequencing experiment that increases the yield of rare disease alleles substantially over random sampling of cases or controls or sampling based on genotypes at associated SNPs from GWAS data. This technique allows for smaller sample sizes for resequencing experiments, or allows the capture of rarer risk alleles. When following up multiple regions, SampleSeq selects subjects with an even representation of all the regions. SampleSeq also can be used to calculate the sample size needed for the resequencing to increase the chance of successful capture of rare alleles of desired frequencies. SOFTWARE: http://biostat.mc.vanderbilt.edu/SampleSeq

6.
To investigate the extent to which the proportion of schizophrenia’s additive genetic variation tagged by SNPs is shared by populations of European and African descent, we analyzed the largest combined African descent (AD [n = 2,142]) and European descent (ED [n = 4,990]) schizophrenia case-control genome-wide association study (GWAS) data set available, the Molecular Genetics of Schizophrenia (MGS) data set. We show how a method that uses genomic similarities at measured SNPs to estimate the additive genetic correlation (SNP correlation [SNP-rg]) between traits can be extended to estimate SNP-rg for the same trait between ethnicities. We estimated SNP-rg for schizophrenia between the MGS ED and MGS AD samples to be 0.66 (SE = 0.23), which is significantly different from 0 (p(SNP-rg = 0) = 0.0003), but not 1 (p(SNP-rg = 1) = 0.26). We re-estimated SNP-rg between an independent ED data set (n = 6,665) and the MGS AD sample to be 0.61 (SE = 0.21, p(SNP-rg = 0) = 0.0003, p(SNP-rg = 1) = 0.16). These results suggest that many schizophrenia risk alleles are shared across ethnic groups and predate African-European divergence.

7.
Omega-3 and omega-6 long-chain polyunsaturated fatty acids (LC-PUFAs) are essential for the development and function of the human brain. They can be obtained directly from food, e.g., fish, or synthesized from precursor molecules found in vegetable oils. To determine the importance of genetic variability to fatty-acid biosynthesis, we studied FADS1 and FADS2, which encode rate-limiting enzymes for fatty-acid conversion. We performed genome-wide genotyping (n = 5,652 individuals) and targeted resequencing (n = 960 individuals) of the FADS region in five European population cohorts. We also analyzed available genomic data from human populations, archaic hominins, and more distant primates. Our results show that present-day humans have two common FADS haplotypes, defined by 28 closely linked SNPs across 38.9 kb, that differ dramatically in their ability to generate LC-PUFAs. No independent effects on FADS activity were seen for rare SNPs detected by targeted resequencing. The more efficient, evolutionarily derived haplotype appeared after the lineage split leading to modern humans and Neanderthals and shows evidence of positive selection. This human-specific haplotype increases the efficiency of synthesizing essential long-chain fatty acids from precursors and thereby might have provided an advantage in environments with limited access to dietary LC-PUFAs. In the modern world, this haplotype has been associated with lifestyle-related diseases, such as coronary artery disease.

8.
In order to assess the efficiency of male gametophytic selection (MGS) for crop improvement, pollen selection for tolerance to herbicide was applied in maize. The experiment was designed to test the parallel reactivity to Alachlor of pollen and plants grown in controlled conditions or in the field, the response to pollen selection in the sporophytic progeny, the response to a second cycle of MGS, and the transmission of the selected trait to the following generations. The results demonstrated that pollen assay can be used to predict Alachlor tolerance under field conditions and to monitor the response to selection. A positive response to selection applied to pollen in the sporophytic progeny was obtained in diverse genetic backgrounds, indicating that the technique can be generally included in standard breeding programs; the analysis of the data produced in a second selection cycle indicated that the selected trait is maintained in the next generation.

9.
Viral nervous necrosis disease (VNN), caused by nervous necrosis virus (NNV), is a major threat to mariculture. Identifying loci and understanding the mechanisms associated with resistance to VNN are important in selective breeding programs. We performed a genome-wide association study (GWAS) using genotyping-by-sequencing (GBS) to study the genomic architecture of resistance to NNV infection in Asian seabass. We genotyped 986 individuals from 43 families produced by 15 founders with 44,498 bi-allelic genetic variants using GBS. The GWAS identified three genome-wide significant loci on chromosomes 16, 19, and 20, respectively, and six suggestive loci on chromosomes 1, 8, 14, 15, 21, and 24, respectively, associated with resistance to NNV infection measured as binary and quantitative traits. Using the 500 most significant markers in combination with a training population of 800 samples could reach a genomic prediction accuracy of 0.7. Candidate genes significantly associated with resistance to NNV, including lysine-specific demethylase 2A, beta-defensin 1, and cystatin-B, which play important roles in immune responses against virus infection, were identified. Almost all the candidate genes were differentially expressed in different tissues against NNV infection. The significant genetic variants can be used in genomic selection and help understand the mechanism of resistance to VNN. Future studies should use populations of large effective size and whole genome resequencing to identify more useful genetic variants.

10.
Adaptation is driven by natural selection; however, many adaptations are caused by weak selection acting over large timescales, complicating its study. Therefore, it is rarely possible to study selection comprehensively in natural environments. The threespine stickleback (Gasterosteus aculeatus) is a well-studied model organism with a short generation time, small genome size, and many genetic and genomic tools available. Within this originally marine species, populations have recurrently adapted to freshwater all over its range. This evolution involved extensive parallelism: pre-existing alleles that adapt sticklebacks to freshwater habitats, but are also present at low frequencies in marine populations, have been recruited repeatedly. While a number of genomic regions responsible for this adaptation have been identified, the details of selection remain poorly understood. Using whole-genome resequencing, we compare pooled genomic samples from marine and freshwater populations of the White Sea basin, and identify 19 short genomic regions that are highly divergent between them, including three known inversions. Seventeen of these regions overlap protein-coding genes, including a number of genes with predicted functions that are relevant for adaptation to the freshwater environment. We then analyze four additional independently derived young freshwater populations of known ages, two natural and two artificially established, and use the observed shifts of allelic frequencies to estimate the strength of positive selection. Adaptation turns out to be quite rapid, indicating strong selection acting simultaneously at multiple regions of the genome, with selection coefficients of up to 0.27. High divergence between marine and freshwater genotypes, lack of reduction in polymorphism in regions responsible for adaptation, and high frequencies of freshwater alleles observed even in young freshwater populations are all consistent with rapid assembly of G. aculeatus freshwater genotypes from pre-existing genomic regions of adaptive variation, with strong selection that favors this assembly acting simultaneously at multiple loci.
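The abstract says selection strength was estimated from allele-frequency shifts in populations of known age, but it does not give the estimator. As a rough sketch under a textbook assumption (a simple haploid, or genic, selection model in which the odds p/(1-p) of the favoured allele are multiplied by 1 + s each generation), a selection coefficient can be backed out from two frequencies and the number of generations. The function names are hypothetical, and the authors' actual method may differ.

```python
def project_frequency(p0, s, generations):
    """Allele frequency after `generations` under haploid selection.

    Assumes odds(p) = p / (1 - p) is multiplied by (1 + s) every generation.
    """
    odds = (p0 / (1 - p0)) * (1 + s) ** generations
    return odds / (1 + odds)


def selection_coefficient(p0, pt, generations):
    """Invert the model above: estimate s from start and end frequencies."""
    if not (0 < p0 < 1 and 0 < pt < 1):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    if generations <= 0:
        raise ValueError("generations must be positive")
    odds_ratio = (pt / (1 - pt)) / (p0 / (1 - p0))
    return odds_ratio ** (1 / generations) - 1


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the study.
    p_end = project_frequency(p0=0.05, s=0.27, generations=20)
    print(f"frequency after 20 generations: {p_end:.3f}")
    print(f"recovered s: {selection_coefficient(0.05, p_end, 20):.3f}")
```

The model ignores drift, dominance and migration, so it should be read as an illustration of how a frequency shift maps to a coefficient rather than as a reproduction of the published estimates.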

11.
We used custom-designed resequencing arrays to generate 3.1 Mb of genomic sequence from a panel of 56 Bacillus anthracis strains. Sequence quality was shown to be very high by replication (discrepancy rate of 7.4 × 10⁻⁷) and by comparison to independently generated shotgun sequence (discrepancy rate < 2.5 × 10⁻⁶). Population genomics studies of microbial pathogens using rapid resequencing technologies such as resequencing arrays are critical for recognizing newly emerging or genetically engineered strains.

12.
The identification of genes influencing fitness is central to our understanding of the genetic basis of adaptation and how it shapes phenotypic variation in wild populations. Here, we used whole‐genome resequencing of wild Rocky Mountain bighorn sheep (Ovis canadensis) to >50‐fold coverage to identify 2.8 million single nucleotide polymorphisms (SNPs) and genomic regions bearing signatures of directional selection (i.e. selective sweeps). A comparison of SNP diversity between the X chromosome and the autosomes indicated that bighorn males had a dramatically reduced long‐term effective population size compared to females. This probably reflects a long history of intense sexual selection mediated by male–male competition for mates. Selective sweep scans based on heterozygosity and nucleotide diversity revealed evidence for a selective sweep shared across multiple populations at RXFP2, a gene that strongly affects horn size in domestic ungulates. The massive horns carried by bighorn rams appear to have evolved in part via strong positive selection at RXFP2. We identified evidence for selection within individual populations at genes affecting early body growth and cellular response to hypoxia; however, these must be interpreted more cautiously as genetic drift is strong within local populations and may have caused false positives. These results represent a rare example of strong genomic signatures of selection identified at genes with known function in wild populations of a nonmodel species. Our results also showcase the value of reference genome assemblies from agricultural or model species for studies of the genomic basis of adaptation in closely related wild taxa.

13.

Background

Ultra high throughput sequencing (UHTS) technologies find an important application in targeted resequencing of candidate genes or of genomic intervals from genetic association studies. Despite the extraordinary power of these new methods, they are still rarely used in routine analysis of human genomic variants, in part because of the absence of specific standard procedures. The aim of this work is to provide human molecular geneticists with a tool to evaluate the best UHTS methodology for efficiently detecting DNA changes, from common SNPs to rare mutations.

Methodology/Principal Findings

We tested the three most widespread UHTS platforms (Roche/454 GS FLX Titanium, Illumina/Solexa Genome Analyzer II and Applied Biosystems/SOLiD System 3) on a well-studied region of the human genome containing many polymorphisms and a very rare heterozygous mutation located within an intronic repetitive DNA element. We identify the qualities and the limitations of each platform and describe some peculiarities of UHTS in resequencing projects.

Conclusions/Significance

When appropriate filtering and mapping procedures are applied, UHTS technology can be safely and efficiently used as a tool for targeted detection of human DNA variation. Unless particular and platform-dependent characteristics are needed for specific projects, the most relevant parameter to consider in mainstream human genome resequencing procedures is the cost per sequenced base pair associated with each machine.

14.
Identification of rare variants by resequencing is important both for detecting novel variations and for screening individuals for known disease alleles. New technologies enable low-cost resequencing of target regions, although it is still prohibitive to test more than a few individuals. We propose a novel pooling design that enables the recovery of novel or known rare alleles and their carriers in groups of individuals. The method is based on a Compressed Sensing (CS) approach, which is general, simple and efficient. CS allows the use of generic algorithmic tools for simultaneous identification of multiple variants and their carriers. We model the experimental procedure and show via computer simulations that it enables the recovery of rare alleles and their carriers in larger groups than were possible before. Our approach can also be combined with barcoding techniques to provide a feasible solution based on current resequencing costs. For example, when targeting a small enough genomic region (∼100 bp) and using only ∼10 sequencing lanes and ∼10 distinct barcodes per lane, one recovers the identity of 4 rare allele carriers out of a population of over 4000 individuals. We demonstrate the performance of our approach over several publicly available experimental data sets.

15.
A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries of genetic variation, like allele frequencies, are also correlated with recombination rate and whether these correlations can be explained solely by negative selection against deleterious mutations or whether positive selection acting on favorable alleles is also required. Here we attempt to address these questions by analyzing three different genome-wide resequencing datasets from European individuals. We document several significant correlations between different genomic features. In particular, we find that average minor allele frequency and diversity are reduced in regions of low recombination and that human diversity, human-chimp divergence, and average minor allele frequency are reduced near genes. Population genetic simulations show that either positive natural selection acting on favorable mutations or negative natural selection acting against deleterious mutations can explain these correlations. However, models with strong positive selection on nonsynonymous mutations and little negative selection predict a stronger negative correlation between neutral diversity and nonsynonymous divergence than observed in the actual data, supporting the importance of negative, rather than positive, selection throughout the genome. Further, we show that the widespread presence of weakly deleterious alleles, rather than a small number of strongly positively selected mutations, is responsible for the correlation between neutral genetic diversity and recombination rate. This work suggests that natural selection has affected multiple aspects of linked neutral variation throughout the human genome and that positive selection is not required to explain these observations.

16.
RecQ DNA helicases from many organisms have been implicated in the maintenance of genomic stability. In human cells, mutation in the WRN helicase, a RecQ-like DNA helicase, results in Werner syndrome (WS), a genetic disorder characterized by genomic instability and premature ageing. Similarly, mutation in SGS1, the RECQ homologue in budding yeast, results in genomic instability and accelerated ageing. We previously demonstrated that mouse WRN interacts physically with a novel, highly conserved protein that we named WHIP, and that in budding yeast cells, simultaneous deletion of WHIP/MGS1 and SGS1 results in slow growth and shortened life span. Here we show by using genetic analysis in Saccharomyces cerevisiae that mgs1Δ sgs1Δ cells have increased rates of terminal G2/M arrest, and show elevated rates of spontaneous sister chromatid recombination (SCR) and rDNA array recombination. Finally, we report that complementation of the synthetic relationship between SGS1 and WHIP/MGS1 requires both the helicase and Top3-binding activities of Sgs1, as well as the ATPase activity of Mgs1. Our results suggest that Whip/Mgs1 is implicated in DNA metabolism, and is required for normal growth and cell cycle progression in the absence of Sgs1.

17.
18.
Sex allocation theory predicts that the optimal sexual resource allocation of simultaneous hermaphrodites is affected by mating group size (MGS). Although the original concept assumes that the MGS does not differ between male and female functions, the MGS in the male function (MGSm; i.e., the number of sperm recipients the focal individual can deliver its sperm to plus one) and that in the female function (MGSf; the number of sperm donors plus one) do not always coincide and may differently affect the optimal sex allocation. Moreover, reproductive costs can be split into “variable” (e.g., sperm and eggs) and “fixed” (e.g., genitalia) costs, but these have been seldom distinguished in empirical studies. We examined the effects of MGSm and MGSf on the fixed and variable reproductive investments in the sessilian barnacle Balanus rostratus. The results showed that MGSm had a positive effect on sex allocation, whereas MGSf had a nearly significant negative effect. Moreover, the “fixed” cost varied with body size and both aspects of MGS. We argue that the two aspects of MGS should be distinguished for organisms with unilateral mating.

19.
Delineating microbial populations, discovering ecologically relevant phenotypes and identifying migrants, hybrids or admixed individuals have long proved notoriously difficult, thereby limiting our understanding of the evolutionary forces at play during the diversification of microbial species. However, recent advances in sequencing and computational methods have enabled an unbiased approach whereby incipient species and the genetic correlates of speciation can be identified by examining patterns of genomic variation within and between lineages. We present here a population genomic study of a phylogenetic species in the Neurospora discreta species complex, based on the resequencing of full genomes (~37 Mb) for 52 fungal isolates from nine sites in three continents. Population structure analyses revealed two distinct lineages in South–East Asia, and three lineages in North America/Europe with a broad longitudinal and latitudinal range and limited admixture between lineages. Genome scans for selective sweeps and comparisons of the genomic landscapes of diversity and recombination provided no support for a role of selection at linked sites on genomic heterogeneity in levels of divergence between lineages. However, demographic inference indicated that the observed genomic heterogeneity in divergence was generated by varying rates of gene flow between lineages following a period of isolation. Many putative cases of exchange of genetic material between phylogenetically divergent fungal lineages have been discovered, and our work highlights the quantitative importance of genetic exchanges between more closely related taxa to the evolution of fungal genomes. Our study also supports the role of allopatric isolation as a driver of diversification in saprobic microbes.

20.
The majority of agronomically important crop traits are quantitative, meaning that they are controlled by multiple genes each with a small effect (quantitative trait loci, QTLs). Mapping and isolation of QTLs is important for efficient crop breeding by marker‐assisted selection (MAS) and for a better understanding of the molecular mechanisms underlying the traits. However, since it requires the development and selection of DNA markers for linkage analysis, QTL analysis has been time‐consuming and labor‐intensive. Here we report the rapid identification of plant QTLs by whole‐genome resequencing of DNAs from two populations each composed of 20–50 individuals showing extreme opposite trait values for a given phenotype in a segregating progeny. We propose to name this approach QTL‐seq as applied to plant species. We applied QTL‐seq to rice recombinant inbred lines and F2 populations and successfully identified QTLs for important agronomic traits, such as partial resistance to the fungal rice blast disease and seedling vigor. A simulation study showed that QTL‐seq is able to detect QTLs over wide ranges of experimental variables, and the method can be generally applied in population genomics studies to rapidly identify genomic regions that underwent artificial or natural selective sweeps.
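The abstract describes comparing two bulks of individuals with opposite extreme phenotypes but does not spell out the statistic. One common way to summarize such a comparison, sketched here as an assumption rather than as the authors' pipeline, is a per-site allele ratio (alternative-allele reads divided by total reads, often called a SNP-index) and its difference between the two bulks. The function names, the input layout and the sign convention (high bulk minus low bulk) are all hypothetical.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads at a site that carry the alternative allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def delta_snp_index(high_bulk, low_bulk):
    """Per-site difference in SNP-index between two bulks.

    Each argument is a list of (alt_reads, total_reads) tuples, one per
    site, in the same site order. Values near +1 or -1 mark sites where
    the two bulks carry different alleles; values near 0 mark sites
    where the bulks look alike.
    """
    if len(high_bulk) != len(low_bulk):
        raise ValueError("both bulks must cover the same sites")
    return [snp_index(*h) - snp_index(*l) for h, l in zip(high_bulk, low_bulk)]


if __name__ == "__main__":
    # Made-up read counts for three sites.
    high = [(18, 20), (10, 20), (3, 30)]
    low = [(2, 20), (10, 20), (27, 30)]
    for site, delta in enumerate(delta_snp_index(high, low), start=1):
        print(f"site {site}: delta = {delta:+.2f}")
```

In practice such per-site values are usually averaged in sliding windows along the chromosome before candidate regions are called, since single sites are noisy at modest read depth.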
