Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Mitochondrial DNA (mtDNA) has formed the backbone of phylogeographic research for many years; however, recent trends focus on genome‐wide analyses. One method proposed for calibrating inferences from noisy next‐generation data, such as RAD sequencing, is to compare these results with analyses of mitochondrial sequences. Most researchers using this approach appear to be unaware that many single nucleotide polymorphisms (SNPs) identified from genome‐wide sequence data are themselves mitochondrial, or assume that these are too few to bias analyses. Here, we demonstrate two methods for mining mitochondrial markers using RAD sequence data from three South African species of yellowfish, Labeobarbus. First, we use a rigorous SNP discovery pipeline, using the program Stacks, to identify variant sites in mtDNA, which we then combine into haplotypes. Second, we directly map sequence reads against a mitochondrial genome reference. This method allowed us to reconstruct up to 98% of the Labeobarbus mitogenome. We validated these mitogenome reconstructions through BLAST database searches and by comparison with cytochrome b gene sequences obtained through Sanger sequencing. Finally, we investigate the organismal consequences of these data including ancient genetic exchange and a recent translocation among populations of L. natalensis, as well as interspecific hybridization between L. aeneus and L. kimberleyensis.

2.
Arctic freshwater ecosystems have been profoundly affected by climate change. Given that the Arctic charr (Salvelinus alpinus) is often the only fish species inhabiting these ecosystems, it represents a valuable model for studying the impacts of climate change on species life‐history diversity and adaptability. Using a genotyping‐by‐sequencing approach, we identified 5,976 neutral single nucleotide polymorphisms and found evidence for reduced gene flow between allopatric morphs from two high Arctic lakes, Linnévatn (Anadromous, Normal, and Dwarf) and Ellasjøen (Littoral and Pelagic). Within each lake, the degree of genetic differentiation ranged from low (Pelagic vs. Littoral) to moderate (Anadromous and Normal vs. Dwarf). We identified 17 highly diagnostic, putatively adaptive SNPs that differentiated the allopatric morphs. Although we found no evidence for adaptive differences between morphs within Ellasjøen, we found evidence for moderate (Anadromous vs. Normal) to high genetic differentiation (Anadromous and Normal vs. Dwarf) among morphs within Linnévatn based on two adaptive loci. As these freshwater ecosystems become more productive, the frequency of sympatric morphs in Ellasjøen will likely shift based on foraging opportunities, whereas the propensity to migrate may decrease in Linnévatn, increasing the frequency of the Normal morph. The Dwarf charr was the most genetically distinct group. Identifying the biological basis for small body size should elucidate the potential for increased growth and subsequent interbreeding with sympatric morphs. Overall, neutral and adaptive genomic differentiation between allopatric and some sympatric morphs suggests that the response of Arctic charr to climate change will be variable across freshwater ecosystems.

3.
The first North American RAD Sequencing and Genomics Symposium, sponsored by Floragenex (http://www.floragenex.com/radmeeting/), took place in Portland, Oregon (USA) on 19 April 2011. This symposium was convened to promote and discuss the use of restriction-site-associated DNA (RAD) sequencing technologies. RAD sequencing is one of several strategies recently developed to increase the power of data generated via short-read sequencing technologies by reducing their complexity (Baird et al. 2008; Huang et al. 2009; Andolfatto et al. 2011; Elshire et al. 2011). RAD sequencing, as a form of genotyping by sequencing, has been effectively applied in genetic mapping and quantitative trait loci (QTL) analyses in a range of organisms including nonmodel, genetically highly heterogeneous organisms (Table 1; Baird et al. 2008; Baxter et al. 2011; Chutimanitsakun et al. 2011; Pfender et al. 2011). RAD sequencing has recently found applications in phylogeography (Emerson et al. 2010) and population genomics (Hohenlohe et al. 2010). Considering the diversity of talks presented during this meeting, more developments are to be expected in the very near future.

4.
Whole genome sequences (WGS) greatly increase our ability to precisely infer population genetic parameters, demographic processes, and selection signatures. However, WGS may still not be affordable for a representative number of individuals/populations. In this context, our goal was to assess the efficiency of several SNP genotyping strategies by testing their ability to accurately estimate parameters describing neutral diversity and to detect signatures of selection. We analysed 110 WGS at 12× coverage for four different species, i.e., sheep, goats and their wild counterparts. From these data we generated 946 data sets corresponding to random panels of 1K to 5M variants, commercial SNP chips and exome capture, for sample sizes of five to 48 individuals. We also extracted low‐coverage genome resequencing of 1×, 2× and 5× by randomly subsampling reads from the 12× resequencing data. Globally, 5K to 10K random variants were enough for an accurate estimation of genome diversity. Conversely, commercial panels and exome capture displayed strong ascertainment biases. Besides the characterization of neutral diversity, the detection of the signature of selection and the accurate estimation of linkage disequilibrium (LD) required high‐density panels of at least 1M variants. Finally, genotype likelihoods increased the quality of variant calling from low coverage resequencing but proportions of incorrect genotypes remained substantial, especially for heterozygote sites. Whole genome resequencing coverage of at least 5× appeared to be necessary for accurate assessment of genomic variations. These results have implications for studies seeking to deploy low‐density SNP collections or genome scans across genetically diverse populations/species showing similar genetic characteristics and patterns of LD decay for a wide variety of purposes.

5.
Despite the increasing opportunity to collect large‐scale data sets for population genomic analyses, the use of high‐throughput sequencing to study populations of polyploids has seen little application. This is due in large part to problems associated with determining allele copy number in the genotypes of polyploid individuals (allelic dosage uncertainty, ADU), which complicates the calculation of important quantities such as allele frequencies. Here, we describe a statistical model to estimate biallelic SNP frequencies in a population of autopolyploids using high‐throughput sequencing data in the form of read counts. We bridge the gap from data collection (using restriction enzyme based techniques [e.g. GBS, RADseq]) to allele frequency estimation in a unified inferential framework using a hierarchical Bayesian model to sum over genotype uncertainty. Simulated data sets were generated under various conditions for tetraploid, hexaploid and octoploid populations to evaluate the model's performance and to help guide the collection of empirical data. We also provide an implementation of our model in the R package polyfreqs and demonstrate its use with two example analyses that investigate (i) levels of expected and observed heterozygosity and (ii) model adequacy. Our simulations show that the number of individuals sampled from a population has a greater impact on estimation error than sequencing coverage. The example analyses also show that our model and software can be used to make inferences beyond the estimation of allele frequencies for autopolyploids by providing assessments of model adequacy and estimates of heterozygosity.
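The idea of summing over genotype uncertainty when estimating an allele frequency from read counts can be illustrated with a small sketch. This is not the hierarchical Bayesian sampler described in the abstract and not the polyfreqs API; it is a simpler expectation-maximization point estimate, assuming genotypes are binomial draws from the allele frequency and reference-read counts are binomial given the dosage. The function name, the fixed sequencing error rate and the iteration count are all hypothetical choices made for illustration.

```python
from math import comb


def estimate_allele_freq(ref_reads, total_reads, ploidy, error=0.01, iters=200):
    """EM point estimate of a biallelic SNP frequency in an autopolyploid sample.

    ref_reads[i] / total_reads[i] are the reference and total read counts of
    individual i. The unknown dosage g (0..ploidy) is summed out in each step.
    """
    p = 0.5  # starting guess for the reference allele frequency
    for _ in range(iters):
        expected_dosage = 0.0
        for r, t in zip(ref_reads, total_reads):
            weights = []
            for g in range(ploidy + 1):
                # Assumed prior: dosage ~ Binomial(ploidy, p)
                prior = comb(ploidy, g) * p**g * (1 - p) ** (ploidy - g)
                # Probability that a read shows the reference allele,
                # allowing for a symmetric sequencing error (assumed value)
                q = (g / ploidy) * (1 - error) + (1 - g / ploidy) * error
                lik = comb(t, r) * q**r * (1 - q) ** (t - r)
                weights.append(prior * lik)
            z = sum(weights)
            expected_dosage += sum(g * w for g, w in enumerate(weights)) / z
        p = expected_dosage / (ploidy * len(ref_reads))
        # Keep p strictly inside (0, 1) so the prior never collapses
        p = min(max(p, 1e-9), 1 - 1e-9)
    return p


if __name__ == "__main__":
    # Five tetraploid individuals, 20 reads each (made-up counts)
    print(estimate_allele_freq([20, 15, 10, 5, 0], [20] * 5, ploidy=4))
```

Because the dosage is never called outright, individuals with few reads contribute less sharply to the estimate, which is the intuition behind modelling ADU rather than filtering it away.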

6.
We are writing in response to the population and phylogenomics meeting review by Andrews & Luikart (2014) entitled ‘Recent novel approaches for population genomics data analysis’. Restriction‐site‐associated DNA (RAD) sequencing has become a powerful and useful approach in molecular ecology, with several different published methods now available to molecular ecologists, none of which can be considered the best option in all situations. A&L report that the original RAD protocol of Miller et al. (2007) and Baird et al. (2008) is superior to all other RAD variants because putative PCR duplicates can be identified (see Baxter et al. 2011), thereby reducing the impact of PCR artefacts on allele frequency estimates (Andrews & Luikart 2014). In response, we (i) challenge the assertion that the original RAD protocol minimizes the impact of PCR artefacts relative to that of other RAD protocols, (ii) present additional biases in RADseq that are at least as important as PCR artefacts in selecting a RAD protocol and (iii) highlight the strengths and weaknesses of four different approaches to RADseq which are a representative sample of all RAD variants: the original RAD protocol (mbRAD, Miller et al. 2007; Baird et al. 2008), double digest RAD (ddRAD, Peterson et al. 2012), ezRAD (Toonen et al. 2013) and 2bRAD (Wang et al. 2012). With an understanding of the strengths and weaknesses of different RAD protocols, researchers can make a more informed decision when selecting a RAD protocol.

7.
Reduced representation genome sequencing such as restriction‐site‐associated DNA (RAD) sequencing is finding increased use to identify and genotype large numbers of single‐nucleotide polymorphisms (SNPs) in model and nonmodel species. We generated a unique resource of novel SNP markers for the European eel using the RAD sequencing approach; the markers were simultaneously identified and scored in a genome‐wide scan of 30 individuals. Whereas genomic resources are increasingly becoming available for this species, including the recent release of a draft genome, no genome‐wide set of SNP markers was available until now. The generated SNPs were widely distributed across the eel genome, aligning to 4779 different contigs and 19 703 different scaffolds. Significant variation was identified, with an average nucleotide diversity of 0.00529 across individuals. Results varied widely across the genome, ranging from 0.00048 to 0.00737 per locus. Based on the average nucleotide diversity across all loci, long‐term effective population size was estimated to range between 132 000 and 1 320 000, which is much higher than previous estimates based on microsatellite loci. The generated SNP resource consisting of 82 425 loci and 376 918 associated SNPs provides a valuable tool for future population genetics and genomics studies and allows for targeting specific genes and particularly interesting regions of the eel genome.
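The effective population size range quoted in this abstract is consistent with the standard neutral expectation for a diploid, π = 4·Ne·μ, solved for Ne. The sketch below works through that arithmetic. The abstract does not state the mutation rates used; the two per-site, per-generation rates here (1e-8 and 1e-9) are an assumption, chosen only because they reproduce the reported bounds.

```python
def effective_population_size(pi: float, mu: float) -> float:
    """Long-term Ne for a diploid under neutrality, from pi = 4 * Ne * mu."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4.0 * mu)


PI_MEAN = 0.00529          # average nucleotide diversity reported in the abstract
ASSUMED_MU = (1e-8, 1e-9)  # assumed mutation rates, not given in the abstract

if __name__ == "__main__":
    for mu in ASSUMED_MU:
        ne = effective_population_size(PI_MEAN, mu)
        print(f"mu = {mu:g}: Ne is roughly {ne:,.0f}")
```

Under these assumed rates the two results fall at about 132 000 and 1 320 000, so a tenfold uncertainty in μ maps directly onto the tenfold width of the reported range.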

8.
Many plants and animals of polyploid origin are currently enjoying a genomics explosion enabled by modern sequencing and genotyping technologies. However, routine filtering of duplicated loci in most studies using genotyping by sequencing introduces an unacceptable, but often overlooked, bias when detecting selection. Retained duplicates from ancient whole‐genome duplications (WGDs) may be found throughout genomes, whereas retained duplicates from recent WGDs are concentrated at distal ends of some chromosome arms. Additionally, segmental duplicates can be found at distal ends or nearly anywhere in a genome. Evidence shows that these duplications facilitate adaptation through one of two pathways: neo‐functionalization or increased gene expression. Filtering duplicates removes distal ends of some chromosomes, and distal ends are especially known to harbour adaptively important genes. Thus, filtering of duplicated loci impoverishes the interpretation of genomic data as signals from contiguous duplicated genes are ignored. We review existing strategies to genotype and map duplicated loci; we focus in detail on an overlooked strategy of using gynogenetic haploids (1N) as a part of new genotyping by sequencing studies. We provide guidelines on how to use this haploid strategy for studies on polyploid‐origin vertebrates including how it can be used to screen duplicated loci in natural populations. We conclude by discussing areas of research that will benefit from better inclusion of polyploid loci; we particularly stress the sometimes overlooked fact that basing genomic studies on dense maps provides value added in the form of locating and annotating outlier loci or colocating outliers into islands of divergence.

9.
Research in evolutionary biology involving nonmodel organisms is rapidly shifting from using traditional molecular markers such as mtDNA and microsatellites to higher throughput SNP genotyping methodologies to address questions in population genetics, phylogenetics and genetic mapping. Restriction site associated DNA sequencing (RAD sequencing or RADseq) has become an established method for SNP genotyping on Illumina sequencing platforms. Here, we developed a protocol and adapters for double‐digest RAD sequencing for Ion Torrent (Life Technologies; Ion Proton, Ion PGM) semiconductor sequencing. We sequenced thirteen genomic libraries of three different nonmodel vertebrate species on Ion Proton with PI chips: Arctic charr Salvelinus alpinus, European whitefish Coregonus lavaretus and common lizard Zootoca vivipara. This resulted in ~962 million single‐end reads overall and a mean of ~74 million reads per library. We filtered the genomic data using Stacks, a bioinformatic tool to process RAD sequencing data. On average, we obtained ~11 000 polymorphic loci per library of 6–30 individuals. We validate our new method by technical and biological replication, by reconstructing phylogenetic relationships, and using a hybrid genetic cross to track genomic variants. Finally, we discuss the differences between using the different sequencing platforms in the context of RAD sequencing, assessing possible advantages and disadvantages. We show that our protocol can be used for Ion semiconductor sequencing platforms for the rapid and cost‐effective generation of variable and reproducible genetic markers.

10.
High‐throughput sequencing has revolutionized population and conservation genetics. RAD sequencing methods, such as 2b‐RAD, can be used on species lacking a reference genome. However, transferring protocols across taxa can potentially lead to poor results. We tested two different IIB enzymes (AlfI and CspCI) on two species with different genome sizes (the loggerhead turtle Caretta caretta and the sharpsnout seabream Diplodus puntazzo) to build a set of guidelines to improve 2b‐RAD protocols on non‐model organisms while optimising costs. Good results were obtained even with degraded samples, showing the value of 2b‐RAD in studies with poor DNA quality. However, library quality was found to be a critical parameter for the number of reads and loci obtained for genotyping. Resampling analyses with different numbers of reads per individual showed a trade‐off between number of loci and number of reads per sample. The resulting accumulation curves can be used as a tool to calculate the number of sequences per individual needed to reach a mean depth ≥20 reads to acquire good genotyping results. Finally, we demonstrated that selective‐base ligation does not affect genomic differentiation between individuals, indicating that this technique can be used in species with large genome sizes to adjust the number of loci to the study scope, to reduce sequencing costs and to maintain suitable sequencing depth for a reliable genotyping without compromising the results. Here, we provide a set of guidelines to improve 2b‐RAD protocols on non‐model organisms with different genome sizes, helping decision‐making for a reliable and cost‐effective genotyping.
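The depth target mentioned in this abstract implies a simple budgeting calculation. The sketch below is a back-of-the-envelope version, not the resampling and accumulation-curve approach used in the study: it assumes reads are spread evenly across loci, and the `usable_fraction` parameter (the share of raw reads that survive filtering) is a hypothetical knob added for illustration.

```python
import math


def reads_needed(n_loci: int, target_depth: float = 20.0,
                 usable_fraction: float = 1.0) -> int:
    """Raw reads per individual needed to reach a mean depth across n_loci.

    Assumes even coverage across loci; real libraries are uneven, so treat
    the result as a lower bound.
    """
    if n_loci < 0 or target_depth <= 0 or not 0 < usable_fraction <= 1:
        raise ValueError("invalid loci count, depth or usable fraction")
    return math.ceil(n_loci * target_depth / usable_fraction)


if __name__ == "__main__":
    # Example: 50,000 loci at mean depth 20, with half of the reads usable
    print(reads_needed(50_000, target_depth=20, usable_fraction=0.5))
```

Reducing the number of loci, for instance through selective‐base ligation as described above, lowers the read requirement proportionally under this simple model.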

11.
Previously we extended the utility of mapping‐by‐sequencing by combining it with sequence capture and mapping sequence data to pseudo‐chromosomes that were organized using wheat–Brachypodium synteny. This, with a bespoke haplotyping algorithm, enabled us to map the flowering time locus in the diploid wheat Triticum monococcum L., identifying a set of deleted genes (Gardiner et al., 2014). Here, we develop this combination of gene enrichment and sliding window mapping‐by‐synteny analysis to map the Yr6 locus for yellow stripe rust resistance in hexaploid wheat. A 110 Mb NimbleGen capture probe set was used to enrich and sequence a doubled haploid mapping population of hexaploid wheat derived from an Avalon and Cadenza cross. The Yr6 locus was identified by mapping to the POPSEQ chromosomal pseudomolecules using a bespoke pipeline and algorithm (Chapman et al., 2015). Furthermore, the same locus was identified using newly developed pseudo‐chromosome sequences as a mapping reference that are based on the genic sequence used for sequence enrichment. The pseudo‐chromosomes allow us to demonstrate the application of mapping‐by‐sequencing to even poorly defined polyploid genomes where chromosomes are incomplete and sub‐genome assemblies are collapsed. This analysis uniquely enabled us to: compare wheat genome annotations; identify the Yr6 locus, defining a smaller genic region than was previously possible; associate the interval with one wheat sub‐genome; and increase the density of associated SNP markers. Finally, we built the pipeline in iPlant, making it a user‐friendly community resource for phenotype mapping.

12.
13.
14.
Geographic patterns of genetic variation are shaped by multiple evolutionary processes, including genetic drift, migration and natural selection. Switchgrass (Panicum virgatum L.) has strong genetic and adaptive differentiation despite life history characteristics that promote high levels of gene flow and can homogenize intraspecific differences, such as wind‐pollination and self‐incompatibility. To better understand how historical and contemporary factors shape variation in switchgrass, we use genotyping‐by‐sequencing to characterize switchgrass from across its range at 98 042 SNPs. Population structuring reflects biogeographic and ploidy differences within and between switchgrass ecotypes and indicates that biogeographic history, ploidy incompatibilities and differential adaptation each have important roles in shaping ecotypic differentiation in switchgrass. At one extreme, we determine that two Panicum taxa are not separate species but are actually conspecific, ecologically divergent types of switchgrass adapted to the extreme conditions of coastal sand dune habitats. Conversely, we identify natural hybrids among lowland and upland ecotypes and visualize their genome‐wide patterns of admixture. Furthermore, we determine that genetic differentiation between primarily tetraploid and octoploid lineages is not caused solely by ploidy differences. Rather, genetic diversity in primarily octoploid lineages is consistent with a history of admixture. This suggests that polyploidy in switchgrass is promoted by admixture of diverged lineages, which may be important for maintaining genetic differentiation between switchgrass ecotypes where they are sympatric. These results provide new insights into the mechanisms shaping variation in widespread species and provide a foundation for dissecting the genetic basis of adaptation in switchgrass.

15.
Wheat breeders and academics alike use single nucleotide polymorphisms (SNPs) as molecular markers to characterize regions of interest within the hexaploid wheat genome. A number of SNP‐based genotyping platforms are available, and their utility depends upon factors such as the available technologies, number of data points required, budgets and the technical expertise required. Unfortunately, markers can rarely be exchanged between existing and newly developed platforms, meaning that previously generated data cannot be compared, or combined, with more recently generated data sets. We predict that genotyping by sequencing will become the predominant genotyping technology within the next 5–10 years. With this in mind, to ensure that data generated from current genotyping platforms continues to be of use, we have designed and utilized SNP‐based capture probes from several thousand existing and publicly available probes from Axiom® and KASP™ genotyping platforms. We have validated our capture probes in a targeted genotyping by sequencing protocol using 31 previously genotyped UK elite hexaploid wheat accessions. Data comparisons between targeted genotyping by sequencing, Axiom® array genotyping and KASP™ genotyping assays identified a set of 3256 probes which reliably bring together targeted genotyping by sequencing data with the previously available marker data set. As such, these probes are likely to be of considerable value to the wheat community. The probe details, full probe sequences and a custom built analysis pipeline may be freely downloaded from the CerealsDB website (http://www.cerealsdb.uk.net/cerealgenomics/CerealsDB/sequence_capture.php).

16.
17.
Species delimitation has seen a paradigm shift as increasing accessibility of genomic‐scale data enables separation of lineages with convergent morphological traits and the merging of recently diverged ecotypes that have distinguishing characteristics. We inferred the process of lineage formation among Australian species in the widespread and highly variable genus Pelargonium by combining phylogenomic and population genomic analyses along with breeding system studies and character analysis. Phylogenomic analysis and population genetic clustering supported seven of the eight currently described species but provided little evidence for differences in genetic structure within the most widely distributed group, that containing P. australe. In contrast, morphometric analysis detected three deep lineages within Australian Pelargonium, with P. australe consisting of five previously unrecognized entities occupying separate geographic ranges. The genomic approach enabled elucidation of parallel evolution in some traits formerly used to delineate species, as well as identification of ecotypic morphological differentiation within recognized species. Highly variable morphology and trait convergence each contribute to the discordance between phylogenomic relationships and morphological taxonomy. Data suggest that genetic divergence among species within the Australian Pelargonium may result from allopatric speciation while morphological differentiation within and among species may be more strongly driven by environmental differences.

18.
Next‐generation reduced representation sequencing (RRS) approaches show great potential for resolving the structure of wild populations. However, the population structure of species that have shown rapid demographic recovery following severe population bottlenecks may still prove difficult to resolve due to high gene flow between subpopulations. Here, we tested the effectiveness of the RRS method Genotyping‐By‐Sequencing (GBS) for describing the population structure of the New Zealand fur seal (NZFS, Arctocephalus forsteri), a species that was heavily exploited by the 19th century commercial sealing industry and has since rapidly recolonized most of its former range from a few isolated colonies. Using 26,026 neutral single nucleotide polymorphisms (SNPs), we assessed genetic variation within and between NZFS colonies. We identified low levels of population differentiation across the species range (<1% of variation explained by regional differences) suggesting a state of near panmixia. Nonetheless, we observed subtle population substructure between West Coast and Southern East Coast colonies and a weak, but significant (p = 0.01), isolation‐by‐distance pattern among the eight colonies studied. Furthermore, our demographic reconstructions supported severe bottlenecks with potential 10‐fold and 250‐fold declines in response to Polynesian and European hunting, respectively. Finally, we were able to assign individuals treated as unknowns to their regions of origin with high confidence (96%) using our SNP data. Our results indicate that while it may be difficult to detect population structure in species that have experienced rapid recovery, next‐generation markers and methods are powerful tools for resolving fine‐scale structure and informing conservation and management efforts.

19.
In a de novo genotyping‐by‐sequencing (GBS) analysis of short, 64‐base tag‐level haplotypes in 4657 accessions of cultivated oat, we discovered 164741 tag‐level (TL) genetic variants containing 241224 SNPs. From this, the marker density of an oat consensus map was increased by the addition of more than 70000 loci. The mapped TL genotypes of a 635‐line diversity panel were used to infer chromosome‐level (CL) haplotype maps. These maps revealed differences in the number and size of haplotype blocks, as well as differences in haplotype diversity between chromosomes and subsets of the diversity panel. We then explored potential benefits of SNP vs. TL vs. CL GBS variants for mapping, high‐resolution genome analysis and genomic selection in oats. A combined genome‐wide association study (GWAS) of heading date from multiple locations using both TL haplotypes and individual SNP markers identified 184 significant associations. A comparative GWAS using TL haplotypes, CL haplotype blocks and their combinations demonstrated the superiority of using TL haplotype markers. Using a principal component‐based genome‐wide scan, genomic regions containing signatures of selection were identified. These regions may contain genes that are responsible for the local adaptation of oats to Northern American conditions. Genomic selection for heading date using TL haplotypes or SNP markers gave comparable and promising prediction accuracies of up to r = 0.74. Genomic selection carried out in an independent calibration and test population for heading date gave promising prediction accuracies that ranged between r = 0.42 and 0.67. In conclusion, TL haplotype GBS‐derived markers facilitate genome analysis and genomic selection in oat.

20.
Targeted GBS is a recent approach for obtaining an effective characterization for hundreds to thousands of markers. The high throughput of next‐generation sequencing technologies, moreover, allows sample multiplexing. The aims of this study were to (i) define a panel of single nucleotide polymorphisms (SNPs) in the cat, (ii) use GBS for profiling 16 cats, and (iii) evaluate the performance with respect to the inference using standard approaches at different coverage thresholds, thereby providing useful information for designing similar experiments. Probes for sequencing 230 variants were designed based on the Felis_catus_8.0 genome. The regions comprised anonymous and non‐anonymous SNPs. Sixteen cat samples were analysed, some of which had already been genotyped at a large group of loci and one of which had been whole‐genome sequenced in the 99_Lives Cat Genome Sequencing Project. The accuracy of the method was assessed by comparing the GBS results with the genotypes already available. Overall, GBS achieved good performance, with 92–96% correct assignments, depending on the coverage threshold used to define the set of trustable genotypes. Analyses confirmed that (i) the reliability of the inference of each genotype depends on the coverage at that locus and (ii) the fraction of target loci whose genotype can be inferred correctly is a function of the total coverage. GBS proves to be a valid alternative to other methods. Data suggested a depth of less than 11× is required for greater than 95% accuracy. However, sequencing depth must be adapted to the total size of the targets to ensure proper genotype inference.
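One reason genotype reliability depends on coverage at a locus can be shown with a short binomial calculation. This sketch is not the evaluation procedure used in the study: it assumes that reads at a true heterozygote sample each allele with probability 0.5, ignores sequencing error, and uses a hypothetical calling rule that requires each allele to be seen at least `min_reads_per_allele` times.

```python
from math import comb


def prob_het_detected(depth: int, min_reads_per_allele: int = 1) -> float:
    """Probability that both alleles of a true heterozygote are observed.

    Reads are modelled as Binomial(depth, 0.5) with no sequencing error;
    both alleles must appear at least min_reads_per_allele times.
    """
    if depth < 0 or min_reads_per_allele < 1:
        raise ValueError("depth must be >= 0 and min_reads_per_allele >= 1")
    k = min_reads_per_allele
    # Count read configurations where the first allele appears i times and
    # the second allele appears depth - i times, both at least k
    hits = sum(comb(depth, i) for i in range(k, depth - k + 1))
    return hits / 2**depth


if __name__ == "__main__":
    for d in (2, 5, 11, 20):
        print(d, round(prob_het_detected(d, min_reads_per_allele=2), 4))
```

Under this simplified model the chance of miscalling a heterozygote as a homozygote shrinks quickly with depth, which matches the abstract's qualitative point that a per-locus coverage threshold trades the number of callable loci against accuracy.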
