Similar literature
1.
Single nucleotide polymorphisms (SNPs) are rapidly replacing anonymous markers in population genomic studies, but their use in non‐model organisms is hampered by the scarcity of cost‐effective approaches to uncover genome‐wide variation in a comprehensive subset of individuals. The screening of one or only a few individuals induces ascertainment bias. To discover SNPs for a population genomic study of the Pyrenean rocket (Sisymbrium austriacum subsp. chrysanthum), we undertook a pooled RAD‐PE (Restriction site Associated DNA Paired‐End sequencing) approach. RAD tags were generated from the PstI‐digested pooled genomic DNA of 12 individuals sampled across the species distribution range and paired‐end sequenced using Illumina technology to produce ~24.5 Mb of sequences, covering ~7% of the species' genome. Sequences were assembled into ~76 000 contigs with a mean length of 323 bp (N50 = 357 bp, sequencing depth = 24x). In all, >15 000 SNPs were called, of which 47% were annotated in putative genic regions based on homology with the Arabidopsis thaliana genome. Gene ontology (GO) slim categorization demonstrated that the identified SNPs covered extant genic variation well. The validation of 300 SNPs on a larger set of individuals using a KASPar assay underpinned the utility of pooled RAD‐PE as an inexpensive genome‐wide SNP discovery technique (success rate: 87%). In addition to SNPs, we discovered >600 putative SSR markers.

2.
Domestication and commercial production of the grasscutter, Thryonomys swinderianus, a large rodent, represents an important opportunity to secure sustainable animal protein for local communities in West Africa. To support production, DNA markers are required for population diversity assessment, pedigree analysis and marker‐assisted selection. This study reports the application of double‐digest RAD sequencing to simultaneously discover and genotype SNP markers in 24 wild and recently domesticated grasscutters. An initial panel of 1209 SNP loci was characterised from a total of more than 21 000 candidate loci containing single SNPs. This genome‐wide resource represents the first application of its type to commercial production of a large rodent for food and advances the use of agricultural genomics in Ghana.

3.
With the advent of next generation sequencing, new avenues have opened to study genomics in wild populations of non‐model species. Here, we describe a successful approach to a genome‐wide medium density Single Nucleotide Polymorphism (SNP) panel in a non‐model species, the house sparrow (Passer domesticus), through the development of a 10 K Illumina iSelect HD BeadChip. Genomic DNA and cDNA derived from six individuals were sequenced on a 454 GS FLX system and generated a total of 1.2 million sequences, in which SNPs were detected. As no reference genome exists for the house sparrow, we used the zebra finch (Taeniopygia guttata) reference genome to determine the most likely position of each SNP. The 10 000 SNPs on the SNP‐chip were selected to be distributed evenly across 31 chromosomes, giving on average one SNP per 100 000 bp. The SNP‐chip was screened across 1968 individual house sparrows from four island populations. Of the original 10 000 SNPs, 7413 were found to be variable, and 99% of these SNPs were successfully called in at least 93% of all individuals. We used the SNP‐chip to demonstrate the ability of such genome‐wide marker data to detect population sub‐division, and compared these results to similar analyses using microsatellites. The SNP‐chip will be used to map Quantitative Trait Loci (QTL) for fitness‐related phenotypic traits in natural populations.

4.
Next‐generation sequencing and the collection of genome‐wide data allow identifying adaptive variation and footprints of directional selection. Using a large SNP data set from 259 RAD‐sequenced European eel individuals (glass eels) from eight locations between 34 and 64°N, we examined the patterns of genome‐wide genetic diversity across locations. We tested for local selection by searching for increased population differentiation using FST‐based outlier tests and by testing for significant associations between allele frequencies and environmental variables. The overall low genetic differentiation found (FST = 0.0007) indicates that most of the genome is homogenized by gene flow, providing further evidence for genomic panmixia in the European eel. The lack of genetic substructuring was consistent at both nuclear and mitochondrial SNPs. Using an extensive number of diagnostic SNPs, results showed a low occurrence of hybrids between European and American eel, mainly limited to Iceland (5.9%), although individuals with signatures of introgression several generations back in time were found in mainland Europe. Despite panmixia, a small set of SNPs showed high genetic differentiation consistent with single‐generation signatures of spatially varying selection acting on glass eels. After screening 50 354 SNPs, a total of 754 potentially locally selected SNPs were identified. Candidate genes for local selection constituted a wide array of functions, including calcium signalling, neuroactive ligand–receptor interaction and circadian rhythm. Remarkably, one of the candidate genes identified is PERIOD, possibly related to differences in local photoperiod associated with the >30° difference in latitude between locations. Genes under selection were spread across the genome, and there were no large regions of increased differentiation as expected when selection occurs within just a single generation due to panmixia. This supports the conclusion that most of the genome is homogenized by gene flow that removes any effects of diversifying selection from each new generation.
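
To make the FST statistic in this abstract concrete, here is a minimal sketch of Wright's FST for a single biallelic SNP, computed from the allele frequencies in two populations. It assumes equal weighting of the two populations and ignores sample-size corrections; the function name is hypothetical and this is not the estimator or outlier test used in the study.

```python
def fst_biallelic(p1, p2):
    """Wright's F_ST for one biallelic SNP.

    p1, p2: frequency of the same allele in two populations.
    Assumes the two populations are weighted equally and that the
    frequencies are known exactly (no sample-size correction).
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity of the pooled (total) population.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within populations.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        # Locus is monomorphic overall, so there is no differentiation.
        return 0.0
    return (h_total - h_within) / h_total
```

Identical frequencies give a value of 0 and fixed alternative alleles give 1, so a genome-wide mean as low as 0.0007 corresponds to almost indistinguishable allele frequencies across locations.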

5.
Recent advances in high‐throughput sequencing technologies have offered the possibility to generate genomewide sequence data to delineate previously unidentified genetic structure, obtain more accurate estimates of demographic parameters and to evaluate potential adaptive divergence. Here, we identified 27 556 single nucleotide polymorphisms for the small yellow croaker (Larimichthys polyactis) using restriction‐site‐associated DNA (RAD) sequencing of 24 individuals from two populations. Significant sources of genetic variation were identified, with an average nucleotide diversity (π) of 0.00105 ± 0.000425 across individuals, and long‐term effective population size was thus estimated to range between 26 172 and 261 716. According to the results, no differentiation between the two populations was detected based on the SNP data set of top quality score per contig or neutral loci. However, the two analysed populations were highly differentiated based on the SNP data sets of both top FST value per contig and the outlier SNPs. Moreover, local adaptation was highlighted by an FST‐based outlier test implemented in LOSITAN, and a total of 538 potentially locally selected SNPs were identified. Blast2GO annotation of contigs containing the outlier SNPs yielded hits for 37 (66%) of 56 significant BLASTX matches. Candidate genes for local adaptation constituted a wide array of biological functions, including cellular response to oxidative stress, actin filament binding, ion transmembrane transport and synapse assembly. The generated SNP resources in this study provided a valuable tool for future population genetics and genomics studies of L. polyactis.
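
The abstract does not say how the effective population size was derived from π. One common route, sketched below as an assumption rather than the authors' method, uses the neutral expectation θ = 4·Ne·μ with π as the estimator of θ. The mutation rates of 1e-8 and 1e-9 per site per generation are illustrative values chosen here, not figures from the source.

```python
def effective_population_size(pi, mu):
    """Long-term Ne from nucleotide diversity under theta = 4 * Ne * mu.

    pi: nucleotide diversity per site (used as the estimate of theta).
    mu: mutation rate per site per generation (an assumed value).
    """
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4 * mu)


# Illustrative bounds using the reported pi and two assumed mutation rates.
ne_low = effective_population_size(0.00105, 1e-8)
ne_high = effective_population_size(0.00105, 1e-9)
```

With the rounded π of 0.00105 these two assumed rates give roughly 26 000 and 260 000, the same order of magnitude as the reported range.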

6.
Whole genome resequencing of 51 Populus nigra (L.) individuals from across Western Europe was performed using Illumina platforms. A total number of 1 878 727 SNPs distributed along the P. nigra reference sequence were identified. The SNP calling accuracy was validated with Sanger sequencing. SNPs were selected within 14 previously identified QTL regions, 2916 expressional candidate genes related to rust resistance, wood properties, water‐use efficiency and bud phenology and 1732 genes randomly spread across the genome. Over 10 000 SNPs were selected for the construction of a 12k Infinium Bead‐Chip array dedicated to association mapping. The SNP genotyping assay was performed with 888 P. nigra individuals. The genotyping success rate was 91%. Our high success rate was due to the discovery panel design and the stringent parameters applied for SNP calling and selection. In the same set of P. nigra genotypes, linkage disequilibrium throughout the genome decayed on average within 5–7 kb to half of its maximum value. As an application test, ADMIXTURE analysis was performed with a selection of 600 SNPs spread throughout the genome and 706 individuals collected along 12 river basins. The admixture pattern was consistent with genetic diversity revealed by neutral markers and the geographical distribution of the populations. These newly developed SNP resources and genotyping array provide a valuable tool for population genetic studies and identification of QTLs through natural‐population based genetic association studies in P. nigra.

7.
Here, we present an adaptation of restriction‐site‐associated DNA sequencing (RAD‐seq) to the Illumina HiSeq2000 technology that we used to produce SNP markers in very large quantities at low cost per unit in the Réunion grey white‐eye (Zosterops borbonicus), a nonmodel passerine bird species with no reference genome. We sequenced a set of six pools of 18–25 individuals using a single sequencing lane. This allowed us to build around 600 000 contigs, among which at least 386 000 could be mapped to the zebra finch (Taeniopygia guttata) genome. This yielded more than 80 000 SNPs that could be mapped unambiguously and are evenly distributed across the genome. Thus, our approach provides a good illustration of the high potential of paired‐end RAD sequencing of pooled DNA samples combined with comparative assembly to the zebra finch genome to build large contigs and characterize vast numbers of informative SNPs in nonmodel passerine bird species in a very efficient and cost‐effective way.

8.
Population genetic studies in nonmodel organisms are often hampered by a lack of reference genomes that are essential for whole‐genome resequencing. In the light of this, genotyping methods have been developed to effectively eliminate the need for a reference genome, such as genotyping by sequencing or restriction site‐associated DNA sequencing (RAD‐seq). However, what remains relatively poorly studied is how accurately these methods capture both the average and the variation in genetic diversity across an organism's genome. In this issue of Molecular Ecology Resources, Dutoit et al. (2016) use whole‐genome resequencing data from the collared flycatcher to assess what factors drive heterogeneity in nucleotide diversity across the genome. Using these data, they then simulate how well different sequencing designs, including RAD sequencing, could capture most of the variation in genetic diversity. They conclude that for evolutionary and conservation‐related studies focused on estimating genomic diversity, researchers should emphasize the number of loci analysed over the number of individuals sequenced.

9.
In a de novo genotyping‐by‐sequencing (GBS) analysis of short, 64‐base tag‐level haplotypes in 4657 accessions of cultivated oat, we discovered 164741 tag‐level (TL) genetic variants containing 241224 SNPs. From this, the marker density of an oat consensus map was increased by the addition of more than 70000 loci. The mapped TL genotypes of a 635‐line diversity panel were used to infer chromosome‐level (CL) haplotype maps. These maps revealed differences in the number and size of haplotype blocks, as well as differences in haplotype diversity between chromosomes and subsets of the diversity panel. We then explored potential benefits of SNP vs. TL vs. CL GBS variants for mapping, high‐resolution genome analysis and genomic selection in oats. A combined genome‐wide association study (GWAS) of heading date from multiple locations using both TL haplotypes and individual SNP markers identified 184 significant associations. A comparative GWAS using TL haplotypes, CL haplotype blocks and their combinations demonstrated the superiority of using TL haplotype markers. Using a principal component‐based genome‐wide scan, genomic regions containing signatures of selection were identified. These regions may contain genes that are responsible for the local adaptation of oats to Northern American conditions. Genomic selection for heading date using TL haplotypes or SNP markers gave comparable and promising prediction accuracies of up to r = 0.74. Genomic selection carried out in an independent calibration and test population for heading date gave promising prediction accuracies that ranged between r = 0.42 and 0.67. In conclusion, TL haplotype GBS‐derived markers facilitate genome analysis and genomic selection in oat.

10.
Single nucleotide polymorphisms (SNPs) are essential to the understanding of population genetic variation and diversity. Here, we performed restriction‐site‐associated DNA sequencing (RAD‐seq) on 72 individuals from 13 Chinese indigenous and three introduced chicken breeds. A total of 620 million reads were obtained using an Illumina HiSeq2000 sequencer. An average of 75 587 SNPs were identified from each individual. Further strict filtering validated 28 895 SNP candidates for all populations. When compared with the NCBI dbSNP (chicken_9031), 15 404 SNPs were new discoveries. In this study, RAD‐seq was performed for the first time on chickens, indicating its effectiveness and potential applications in genetic analysis and in breeding techniques for whole‐genome selection in chicken and other agricultural animals.

11.
Molecular markers produced by next‐generation sequencing (NGS) technologies are revolutionizing genetic research. However, the costs of analysing large numbers of individual genomes remain prohibitive for most population genetics studies. Here, we present results based on mathematical derivations showing that, under many realistic experimental designs, NGS of DNA pools from diploid individuals allows the estimation of allele frequencies at single nucleotide polymorphisms (SNPs) with at least the same accuracy as individual‐based analyses, for considerably lower library construction and sequencing efforts. These findings remain true when taking into account the possibility of substantially unequal contributions of each individual to the final pool of sequence reads. We propose the intuitive notion of effective pool size to account for unequal pooling and derive a Bayesian hierarchical model to estimate this parameter directly from the data. We provide a user‐friendly application assessing the accuracy of allele frequency estimation from both pool‐ and individual‐based NGS population data under various sampling, sequencing depth and experimental error designs. We illustrate our findings with theoretical examples and real data sets corresponding to SNP loci obtained using restriction site–associated DNA (RAD) sequencing in pool‐ and individual‐based experiments carried out on the same population of the pine processionary moth (Thaumetopoea pityocampa). NGS of DNA pools might not be optimal for all types of studies but provides a cost‐effective approach for estimating allele frequencies for very large numbers of SNPs. It thus allows comparison of genome‐wide patterns of genetic variation for large numbers of individuals in multiple populations.
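
The trade-off described in this abstract can be illustrated with a simplified two-stage sampling model. This is a sketch under stated assumptions, not the authors' derivation: every individual contributes equally to the pool (so no effective pool size correction), reads are drawn independently from the pooled chromosomes, and sequencing error is ignored. The function names are hypothetical.

```python
def var_individual(p, n):
    """Sampling variance of the allele frequency estimate when n diploid
    individuals (2n chromosomes) are genotyped without error."""
    return p * (1 - p) / (2 * n)


def var_pool(p, n, depth):
    """Sampling variance when n diploid individuals are pooled and the
    locus is covered by `depth` reads.

    Stage 1 samples 2n chromosomes from the population; stage 2 samples
    `depth` reads with replacement from those chromosomes.
    """
    chromosomes = 2 * n
    return p * (1 - p) * (1 / chromosomes + (1 - 1 / chromosomes) / depth)
```

Under this model the pooled estimate always carries an extra read-sampling term, which shrinks as depth grows. The abstract's cost argument concerns how that term compares with the expense of building and sequencing one library per individual, which this sketch does not model.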

12.
Properly estimating genetic diversity in populations of nonmodel species requires a basic understanding of how diversity is distributed across the genome and among individuals. To this end, we analysed whole‐genome resequencing data from 20 collared flycatchers (genome size ≈1.1 Gb; 10.13 million single nucleotide polymorphisms detected). Genomewide nucleotide diversity was almost identical among individuals (mean = 0.00394, range = 0.00384–0.00401), but diversity levels varied extensively across the genome (95% confidence interval for 200‐kb windows = 0.0013–0.0053). Diversity was related to selective constraint such that in comparison with intergenic DNA, diversity at fourfold degenerate sites was reduced to 85%, 3′ UTRs to 82%, 5′ UTRs to 70% and nondegenerate sites to 12%. There was a strong positive correlation between diversity and chromosome size, probably driven by a higher density of targets for selection on smaller chromosomes increasing the diversity‐reducing effect of linked selection. Simulations exploring the ability of sequence data from a small number of genetic markers to capture the observed diversity clearly demonstrated that diversity estimation from finite sampling of such data is bound to be associated with large confidence intervals. Nevertheless, we show that precision in diversity estimation in large outbred populations benefits from increasing the number of loci rather than the number of individuals. Simulations mimicking RAD sequencing showed that this approach gives accurate estimates of genomewide diversity. Based on the patterns of observed diversity and the performed simulations, we provide broad recommendations for how genetic diversity should be estimated in natural populations.
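
Nucleotide diversity, the quantity this abstract reports per individual and per 200‐kb window, is the average number of pairwise differences per site. A minimal sketch follows, assuming a small set of aligned, equal-length haplotype sequences with no missing data; the function name is hypothetical and real pipelines work from variant calls rather than raw strings.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site across aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)
```

Applying the same function to successive windows of an alignment gives the kind of window-level distribution summarized in the abstract.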

13.
Genetic relatedness of 24 animals belonging to seven Indian cattle breeds was studied using high throughput genotyping‐by‐sequencing (GBS) markers. GBS produced 93.6 million reads with an average of about 3.9 million reads per animal. A total of 107 488 SNPs were identified in these individuals. When only one SNP per read was considered, a total of 60 261 SNPs representing independent reads were identified with an average SNP‐to‐SNP distance of 45 kb across the bovine reference genome. About 24% of the GBS‐SNP markers were more than 100 kb apart. Of these, 58 322 SNPs mapped to autosomes, 1645 to the X chromosome and 28 to the Y chromosome. The average SNP‐to‐SNP distance on the X chromosome was 91.3 kb, whereas on the Y chromosome it was 1546.4 kb. The minor allele frequency within the Indian cattle varied from 0.103 (Ongole) to 0.177 (Siri), whereas Holstein cattle had the lowest value of 0.089. This is the first application of GBS in cattle of South Asia. The baseline information generated in this study might prompt implementation of GBS in breeding of cattle belonging to this region.

14.
Salmonid genomes are considered to be in a pseudo‐tetraploid state as a result of a genome duplication event that occurred between 25 and 100 Ma. This situation complicates single‐nucleotide polymorphism (SNP) discovery in rainbow trout as many putative SNPs are actually paralogous sequence variants (PSVs) and not simple allelic variants. To differentiate PSVs from simple allelic variants, we used 19 homozygous doubled haploid (DH) lines that represent a wide geographical range of rainbow trout populations. In the first phase of the study, we analysed SbfI restriction‐site associated DNA (RAD) sequence data from all the 19 lines and selected 11 lines for an extended SNP discovery. In the second phase, we conducted the extended SNP discovery using PstI RAD sequence data from the selected 11 lines. The complete data set is composed of 145 168 high‐quality putative SNPs that were genotyped in at least nine of the 11 lines, of which 71 446 (49%) had minor allele frequencies (MAF) of at least 18% (i.e. at least two of the 11 lines). Approximately 14% of the RAD SNPs in this data set are from expressed or coding rainbow trout sequences. Our comparison of the current data set with previous SNP discovery data sets revealed that 99% of our SNPs are novel. In the support files for this resource, we provide annotation to the positions of the SNPs in the working draft of the rainbow trout reference genome, provide the genotypes of each sample in the discovery panel and identify SNPs that are likely to be in coding sequences.
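
The "18% (i.e. at least two of the 11 lines)" threshold follows from each doubled haploid line being homozygous, so each line contributes one allele per SNP. The sketch below shows that arithmetic; the function name and the use of None for an uncalled line are illustrative assumptions, and biallelic SNPs are assumed.

```python
from collections import Counter


def maf_from_dh_lines(alleles):
    """Minor allele frequency at a biallelic SNP across homozygous lines.

    alleles: one allele per line (e.g. "A" or "G"); None marks a line
    with no genotype call. Returns None if no line was called.
    """
    called = [a for a in alleles if a is not None]
    if not called:
        return None
    counts = Counter(called)
    # For a biallelic SNP the minor allele is everything but the major one.
    major_count = max(counts.values())
    return 1 - major_count / len(called)


# Two of eleven lines carrying the alternative allele gives 2/11.
maf = maf_from_dh_lines(["A"] * 9 + ["G"] * 2)
```

Two of eleven lines corresponds to 2/11, or about 18%, which matches the threshold quoted in the abstract.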

15.
The rapid development and application of molecular marker assays have facilitated genomic selection and genome‐wide linkage and association studies in wheat breeding. Although PCR‐based markers (e.g. simple sequence repeats and functional markers) and genotyping by sequencing have contributed greatly to gene discovery and marker‐assisted selection, the release of a more accurate and complete bread wheat reference genome has resulted in the design of single‐nucleotide polymorphism (SNP) arrays based on different densities or application targets. Here, we evaluated seven types of wheat SNP arrays in terms of their SNP number, distribution, density, associated genes, heterozygosity and application. The results suggested that the Wheat 660K SNP array contained the highest percentage (99.05%) of genome‐specific SNPs with reliable physical positions. SNP density analysis indicated that the SNPs were almost evenly distributed across the whole genome. In addition, 229 266 SNPs in the Wheat 660K SNP array were located in 66 834 annotated gene or promoter intervals. The annotated genes revealed by the Wheat 660K SNP array covered almost all genes revealed by the Wheat 35K (97.44%), 55K (99.73%), 90K (86.9%) and 820K (85.3%) SNP arrays. Therefore, the Wheat 660K SNP array could act as a substitute for the other six arrays and shows promise for a wide range of possible applications. In summary, the Wheat 660K SNP array is reliable and cost‐effective and may be the best choice for targeted genotyping and marker‐assisted selection in wheat genetic improvement.

16.
Marine systems have traditionally been thought of as “open” with few barriers to gene flow. In particular, many marine organisms in the Southern Ocean purportedly possess circumpolar distributions that have rarely been well verified. Here, we use the highly abundant and endemic Southern Ocean brittle star Ophionotus victoriae to examine genetic structure and determine whether barriers to gene flow have existed around the Antarctic continent. Ophionotus victoriae possesses feeding planktotrophic larvae with presumed high dispersal capability, but a previous study revealed genetic structure along the Antarctic Peninsula. To test the extent of genetic differentiation within O. victoriae, we sampled from the Ross Sea through the eastern Weddell Sea. Whereas two mitochondrial DNA markers (16S rDNA and COI) were employed to allow comparison to earlier work, a 2b‐RAD single‐nucleotide polymorphism (SNP) approach allowed sampling of loci across the genome. Mitochondrial data from 414 individuals suggested three major lineages, but 2b‐RAD data generated 1,999 biallelic loci that identified four geographically distinct groups from 89 samples. Given the greater resolution by SNP data, O. victoriae can be divided into geographically distinct populations likely representing multiple species. Specific historical scenarios that explain current population structure were examined with approximate Bayesian computation (ABC) analyses. Although the Bransfield Strait region shows high diversity possibly due to mixing, our results suggest that within the recent past, dispersal processes due to strong currents such as the Antarctic Circumpolar Current have not overcome genetic subdivision presumably due to historical isolation, questioning the idea of large open circumpolar populations in the Southern Ocean.

17.
Research in evolutionary biology involving nonmodel organisms is rapidly shifting from using traditional molecular markers such as mtDNA and microsatellites to higher throughput SNP genotyping methodologies to address questions in population genetics, phylogenetics and genetic mapping. Restriction site associated DNA sequencing (RAD sequencing or RADseq) has become an established method for SNP genotyping on Illumina sequencing platforms. Here, we developed a protocol and adapters for double‐digest RAD sequencing for Ion Torrent (Life Technologies; Ion Proton, Ion PGM) semiconductor sequencing. We sequenced thirteen genomic libraries of three different nonmodel vertebrate species on Ion Proton with PI chips: Arctic charr Salvelinus alpinus, European whitefish Coregonus lavaretus and common lizard Zootoca vivipara. This resulted in ~962 million single‐end reads overall and a mean of ~74 million reads per library. We filtered the genomic data using Stacks, a bioinformatic tool to process RAD sequencing data. On average, we obtained ~11 000 polymorphic loci per library of 6–30 individuals. We validate our new method by technical and biological replication, by reconstructing phylogenetic relationships, and using a hybrid genetic cross to track genomic variants. Finally, we discuss the differences between using the different sequencing platforms in the context of RAD sequencing, assessing possible advantages and disadvantages. We show that our protocol can be used for Ion semiconductor sequencing platforms for the rapid and cost‐effective generation of variable and reproducible genetic markers.

18.
Data from a large‐scale restriction site‐associated DNA sequencing (RAD‐Seq) study of nine butterflyfish species in the Red Sea and Arabian Sea provided a means to test the utility of a recently published draft genome (Chaetodon austriacus) and assess apparent bias in this method of isolating nuclear loci. We here processed double‐digest restriction site‐associated DNA (ddRAD) sequencing data to identify single nucleotide polymorphism (SNP) markers and their associated function with and without our reference genome to see whether it improves the quality of RAD‐Seq. Our analyses indicate (i) a modest gap between the number of nonannotated versus annotated SNPs across all species, (ii) an advantage of using genomic resources for closely related but not distantly related butterflyfish species based on the ability to assign putative gene function to SNPs and (iii) an enrichment of genes among sister butterflyfish taxa related to calcium transmembrane transport and binding. The latter result highlights the potential for this approach to reveal insights into adaptive mechanisms in populations inhabiting challenging coral reef environments such as the Red Sea, Arabian Sea and Arabian Gulf with further study.

19.
Restriction‐site‐associated DNA tag (RAD‐tag) sequencing has become a popular approach to generate thousands of SNPs used to address diverse questions in population genomics. Comparatively, the suitability of RAD‐tag genotyping to address evolutionary questions across divergent species has been the subject of only a few recent studies. Here, we evaluate the applicability of this approach to conduct genome‐wide scans for polymorphisms across two cetacean species belonging to distinct families: the short‐beaked common dolphin (Delphinus delphis; n = 5 individuals) and the harbour porpoise (Phocoena phocoena; n = 1 individual). Additionally, we explore the effects of varying two parameters in the Stacks analysis pipeline on the number of loci and level of divergence obtained. We observed a 34% drop in the total number of loci that were present in all individuals when analysing individuals from the distinct families compared with analyses restricted to intraspecific comparisons (i.e. within D. delphis). Despite relatively stringent quality filters, 3595 polymorphic loci were retrieved from our interfamilial comparison. Cetaceans have undergone rapid diversification, and the estimated divergence time between the two families is relatively recent (14–19 Ma). Thus, our results showed that, for this level of divergence, a large number of orthologous loci can still be genotyped using this approach, which is on par with two recent in silico studies. Our findings constitute one of the first empirical investigations using RAD‐tag sequencing at this level of divergence and highlight the great potential of this approach in comparative studies and to address evolutionary questions.

20.
The maintenance or breakdown of reproductive isolation is an observable outcome of secondary contact between species. In cases where hybrids beyond the F1 are formed, the representation of each species' ancestry can vary dramatically among genomic regions. This genomic heterogeneity in ancestry and introgression can offer insight into evolutionary processes, particularly if introgression is compared in multiple hybrid zones. Similarly, considerable heterogeneity exists across the genome in the extent to which populations and species have diverged, reflecting the combined effects of different evolutionary processes on genetic variation. We studied hybridization across two hybrid zones of two phenotypically well‐differentiated bird species in Mexico (Pipilo maculatus and P. ocai), to investigate genomic heterogeneity in differentiation and introgression. Using genotyping‐by‐sequencing (GBS) and hierarchical Bayesian models, we genotyped 460 birds at over 41 000 single nucleotide polymorphism (SNP) loci. We identified loci exhibiting extreme introgression relative to the genome‐wide expectation using a Bayesian genomic cline model. We also estimated locus‐specific FST and identified loci with exceptionally high genetic divergence between the parental species. We found some concordance of locus‐specific introgression in the two independent hybrid zones (6–20% of extreme loci shared across zones), reflecting areas of the genome that experience similar gene flow when the species interact. Additionally, heterogeneity in introgression and divergence across the genome revealed another subset of loci under the influence of locally specific factors. These results are consistent with a history in which reproductive isolation has been influenced by a common set of loci in both hybrid zones, but where local environmental and stochastic factors also lead to genomic differentiation.
