Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Restriction‐site associated DNA sequencing (RAD‐seq) can identify and score thousands of genetic markers from a group of samples for population‐genetics studies. One challenge of de novo RAD‐seq analysis is to distinguish paralogous sequence variants (PSVs) from true single‐nucleotide polymorphisms (SNPs) associated with orthologous loci. In the absence of a reference genome, it is difficult to differentiate true SNPs from PSVs, and their impact on downstream analysis remains unclear. Here, we introduce a network‐based approach, PMERGE, that connects fragments based on their DNA sequence similarity to identify probable PSVs. Applying our method to de novo RAD‐seq data from 150 Atlantic salmon (Salmo salar) samples collected from 15 locations across the Southern Newfoundland coast identified 87% of the PSVs detected through alignment to the Atlantic salmon genome. Removal of these paralogs altered the inferred population structure, highlighting the potential impact of filtering in RAD‐seq analysis. PMERGE was also applied to a green crab (Carcinus maenas) data set consisting of 242 samples from 11 different locations, where it identified and removed the majority of paralogous loci (62%). The PMERGE software can be run as part of the widely used Stacks analysis package.
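
The idea of connecting fragments by sequence similarity can be illustrated with a small graph sketch. This is not the PMERGE algorithm itself, whose details the abstract does not give; it is a minimal illustration that assumes equal‐length consensus sequences, a plain Hamming distance, and a hypothetical `max_mismatches` threshold. Loci that end up in the same connected component are treated as candidate paralog groups.

```python
from itertools import combinations


def hamming(a, b):
    """Count mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def similarity_clusters(loci, max_mismatches=2):
    """Group loci into connected components of a similarity graph.

    loci: dict mapping a locus id to its consensus sequence
          (equal lengths are assumed for this illustration).
    An edge joins two loci that differ at no more than max_mismatches sites.
    Returns only components with more than one locus.
    """
    parent = {name: name for name in loci}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(loci, 2):
        if hamming(loci[a], loci[b]) <= max_mismatches:
            parent[find(a)] = find(b)

    clusters = {}
    for name in loci:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(c) for c in clusters.values() if len(c) > 1)


# Hypothetical toy data: locus_1 and locus_2 differ at one site.
loci = {
    "locus_1": "ACGTACGTAC",
    "locus_2": "ACGTACGTAT",
    "locus_3": "TTTTGGGGCC",
}
print(similarity_clusters(loci))  # [['locus_1', 'locus_2']]
```

The all‐pairs comparison is quadratic in the number of loci, so a real data set with tens of thousands of loci would need indexing or an aligner rather than this brute‐force loop.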

2.
Single nucleotide polymorphisms (SNPs) are rapidly replacing anonymous markers in population genomic studies, but their use in non‐model organisms is hampered by the scarcity of cost‐effective approaches to uncover genome‐wide variation in a comprehensive subset of individuals. The screening of one or only a few individuals induces ascertainment bias. To discover SNPs for a population genomic study of the Pyrenean rocket (Sisymbrium austriacum subsp. chrysanthum), we undertook a pooled RAD‐PE (Restriction site Associated DNA Paired‐End sequencing) approach. RAD tags were generated from the PstI‐digested pooled genomic DNA of 12 individuals sampled across the species distribution range and paired‐end sequenced using Illumina technology to produce ~24.5 Mb of sequences, covering ~7% of the species' genome. Sequences were assembled into ~76 000 contigs with a mean length of 323 bp (N50 = 357 bp, sequencing depth = 24x). In all, >15 000 SNPs were called, of which 47% were annotated in putative genic regions based on homology with the Arabidopsis thaliana genome. Gene ontology (GO) slim categorization demonstrated that the identified SNPs covered extant genic variation well. The validation of 300 SNPs on a larger set of individuals using a KASPar assay underpinned the utility of pooled RAD‐PE as an inexpensive genome‐wide SNP discovery technique (success rate: 87%). In addition to SNPs, we discovered >600 putative SSR markers.

3.
Hybridization with introduced rainbow trout threatens most native westslope cutthroat trout populations. Understanding the genetic effects of hybridization and introgression requires a large set of high-throughput, diagnostic genetic markers to inform conservation and management. Recently, we identified several thousand candidate single-nucleotide polymorphism (SNP) markers based on RAD sequencing of 11 westslope cutthroat trout and 13 rainbow trout individuals. Here, we used flanking sequence for 56 of these candidate SNP markers to design high-throughput genotyping assays. We validated the assays on a total of 92 individuals from 22 populations and seven hatchery strains. Forty-six assays (82%) amplified consistently and allowed easy identification of westslope cutthroat and rainbow trout alleles as well as heterozygote controls. The 46 SNPs will provide high power for early detection of population admixture and improved identification of hybrid and nonhybridized individuals. This technique shows promise as a very low-cost, reliable and relatively rapid method for developing and testing SNP markers for nonmodel organisms with limited genomic resources.

4.
Flexibility and low cost make genotyping‐by‐sequencing (GBS) an ideal tool for population genomic studies of nonmodel species. However, to utilize the potential of the method fully, many parameters affecting library quality and single nucleotide polymorphism (SNP) discovery require optimization, especially for conifer genomes with a high repetitive DNA content. In this study, we explored strategies for effective GBS analysis in pine species. We constructed GBS libraries using HpaII, PstI and EcoRI‐MseI digestions with different multiplexing levels and examined the effect of restriction enzymes on library complexity and the impact of sequencing depth and size selection of restriction fragments on sequence coverage bias. We tested and compared UNEAK, Stacks and GATK pipelines for the GBS data, and then developed a reference‐free SNP calling strategy for haploid pine genomes. Our GBS procedure proved to be effective in SNP discovery, producing 7000–11 000 and 14 751 SNPs within and among three pine species, respectively, from a PstI library. This investigation provides guidance for the design and analysis of GBS experiments, particularly for organisms for which genomic information is lacking.

5.
6.
DNA sequence data were collected and screened for single nucleotide polymorphisms (SNPs) in westslope cutthroat trout (Oncorhynchus clarki lewisi) and also for substitutions that could be used to genetically discriminate rainbow trout (O. mykiss) and cutthroat trout, as well as several cutthroat trout subspecies. In total, 260 expressed sequence tag‐derived loci were sequenced and allelic discrimination genotyping assays developed from 217 of the variable sites. Another 50 putative SNPs in westslope cutthroat trout were identified by restriction‐site‐associated DNA sequencing, and seven of these were developed into assays. Twelve O. mykiss SNP assays that were variable within westslope cutthroat trout and 12 previously published SNP assays were also included in downstream testing. A total of 241 assays were tested on six westslope cutthroat trout populations (N = 32 per population), as well as collections of four other cutthroat trout subspecies and a population of rainbow trout. All assays were evaluated for reliability and deviation from Hardy–Weinberg and linkage equilibria. Poorly performing and duplicate assays were removed from the data set, and the remaining 200 assays were used in tests of population differentiation. The remaining markers easily distinguished the various subspecies tested, as evidenced by mean GST of 0.74. A smaller subset of the markers (N = 86; average GST = 0.40) was useful for distinguishing the six populations of westslope cutthroat trout. This study increases by an order of magnitude the number of genetic markers available for the study of westslope cutthroat trout and closely related taxa and includes many markers in genes (developed from ESTs).

7.
Restriction‐site‐associated DNA sequencing (RAD‐seq) and related methods are revolutionizing the field of population genomics in nonmodel organisms, as they can generate an unprecedented number of single nucleotide polymorphisms (SNPs) even when no genomic information is available. Yet, RAD‐seq data analyses rely on assumptions about the nature and number of nucleotide variants present in a single locus, the choice of which may lead to an under‐ or overestimated number of SNPs and/or to incorrectly called genotypes. Using the Atlantic mackerel (Scomber scombrus L.) and a close relative, the Atlantic chub mackerel (Scomber colias), as a case study, we explore the sensitivity of population structure inferences to two crucial aspects of RAD‐seq data analysis: the maximum number of mismatches allowed to merge reads into a locus and the relatedness of the individuals used for genotype calling and SNP selection. Our study resolves the population structure of the Atlantic mackerel but, most importantly, provides insights into the effects of alternative RAD‐seq data analysis strategies on population structure inferences that are directly applicable to other species.

8.
Research in evolutionary biology involving nonmodel organisms is rapidly shifting from using traditional molecular markers such as mtDNA and microsatellites to higher throughput SNP genotyping methodologies to address questions in population genetics, phylogenetics and genetic mapping. Restriction site associated DNA sequencing (RAD sequencing or RADseq) has become an established method for SNP genotyping on Illumina sequencing platforms. Here, we developed a protocol and adapters for double‐digest RAD sequencing for Ion Torrent (Life Technologies; Ion Proton, Ion PGM) semiconductor sequencing. We sequenced thirteen genomic libraries of three different nonmodel vertebrate species on Ion Proton with PI chips: Arctic charr Salvelinus alpinus, European whitefish Coregonus lavaretus and common lizard Zootoca vivipara. This resulted in ~962 million single‐end reads overall and a mean of ~74 million reads per library. We filtered the genomic data using Stacks, a bioinformatic tool to process RAD sequencing data. On average, we obtained ~11 000 polymorphic loci per library of 6–30 individuals. We validate our new method by technical and biological replication, by reconstructing phylogenetic relationships, and using a hybrid genetic cross to track genomic variants. Finally, we discuss the differences between using the different sequencing platforms in the context of RAD sequencing, assessing possible advantages and disadvantages. We show that our protocol can be used for Ion semiconductor sequencing platforms for the rapid and cost‐effective generation of variable and reproducible genetic markers.

9.
To mine possibly hidden causal single‐nucleotide polymorphisms (SNPs) of melanoma, we investigated the association of SNPs in 76 M/G1 transition genes with melanoma risk using our published genome‐wide association study (GWAS) data set with 1804 melanoma cases and 1026 cancer‐free controls. We found multiple SNPs with P < 0.01 and performed validation studies for 18 putative functional SNPs in PSMB9 in two other GWAS data sets. Two SNPs (rs1351383 and rs2127675) were associated with melanoma risk in the GenoMEL data set (P = 0.013 and 0.004, respectively), but failed in validation using the Australian data set. Genotype–phenotype analysis revealed these two SNPs were significantly correlated with mRNA expression level of PSMB9. Further experiments revealed that SNP rs2071480, which is in high LD with rs1351383 and rs2127675, may have a weak effect on the promoter activity of PSMB9. Taken together, our data suggested that functional variants in PSMB9 may contribute to melanoma susceptibility.

10.
Single nucleotide polymorphisms (SNPs) are becoming more commonly used as molecular markers in conservation studies. However, relatively few studies have employed SNPs for species with little or no existing sequence data, partly due to the practical challenge of locating appropriate SNP loci in these species. Here we describe an application of SNP discovery via shotgun cloning that requires no pre-existing sequence data and is readily applied to all taxa. Using this method, we isolated, cloned and screened for SNP variation at 90 anonymous sequence loci (51 kb total) from the banded wren (Thryothorus pleurostictus), a Central American species with minimal pre-existing sequence data and a documented paucity of microsatellite allelic variation. We identified 168 SNPs (a mean of one SNP/305 bp, with SNPs unevenly distributed across loci). Further characterization of variation at 41 of these SNP loci among 256 individuals including 37 parent–offspring families suggests that they provide substantial information for defining the genetic mating system of this species, and that SNPs may be generally useful for this purpose when other markers are problematic.

11.
Recent advances in high‐throughput sequencing technologies have made it possible to generate genomewide sequence data to delineate previously unidentified genetic structure, obtain more accurate estimates of demographic parameters and evaluate potential adaptive divergence. Here, we identified 27 556 single nucleotide polymorphisms for the small yellow croaker (Larimichthys polyactis) using restriction‐site‐associated DNA (RAD) sequencing of 24 individuals from two populations. Significant sources of genetic variation were identified, with an average nucleotide diversity (π) of 0.00105 ± 0.000425 across individuals, and long‐term effective population size was thus estimated to range between 26 172 and 261 716. No differentiation between the two populations was detected based on the SNP data set of top quality score per contig or on neutral loci. However, the two analysed populations were highly differentiated based on the SNP data sets of both top FST value per contig and the outlier SNPs. Moreover, local adaptation was highlighted by an FST‐based outlier test implemented in LOSITAN, and a total of 538 potentially locally selected SNPs were identified. Blast2GO annotation of contigs containing the outlier SNPs yielded hits for 37 (66%) of 56 significant BLASTX matches. Candidate genes for local adaptation constituted a wide array of biological functions, including cellular response to oxidative stress, actin filament binding, ion transmembrane transport and synapse assembly. The SNP resources generated in this study provide a valuable tool for future population genetics and genomics studies of L. polyactis.
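
Estimating long‐term effective population size from nucleotide diversity commonly relies on the neutral expectation π ≈ 4·Ne·μ for a diploid population. The sketch below applies that relation; it is an assumption that the study used this form, and the mutation rates shown are illustrative placeholders, since the abstract does not state which rates were used. The function name `effective_size` is hypothetical.

```python
def effective_size(pi, mu):
    """Long-term effective population size under pi = 4 * Ne * mu.

    pi: nucleotide diversity per site.
    mu: mutation rate per site per generation (an assumed input).
    """
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4 * mu)


# pi is taken from the abstract; both mutation rates are assumptions.
pi = 0.00105
for mu in (1e-8, 1e-9):
    print(f"mu = {mu:g}: Ne = {effective_size(pi, mu):.0f}")
```

A tenfold change in the assumed mutation rate changes the estimate tenfold, which is why such estimates are usually reported as a range.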

12.
13.
The increased numbers of genetic markers produced by genomic techniques have the potential to both identify hybrid individuals and localize chromosomal regions responding to selection and contributing to introgression. We used restriction-site-associated DNA sequencing to identify a dense set of candidate SNP loci with fixed allelic differences between introduced rainbow trout (Oncorhynchus mykiss) and native westslope cutthroat trout (Oncorhynchus clarkii lewisi). We distinguished candidate SNPs from homeologs (paralogs resulting from whole-genome duplication) by detecting excessively high observed heterozygosity and deviations from Hardy-Weinberg proportions. We identified 2923 candidate species-specific SNPs from a single Illumina sequencing lane containing 24 barcode-labelled individuals. Published sequence data and ongoing genome sequencing of rainbow trout will allow physical mapping of SNP loci for genome-wide scans and will also provide flanking sequence for design of qPCR-based TaqMan® assays for high-throughput, low-cost hybrid identification using a subset of 50-100 loci. This study demonstrates that it is now feasible to identify thousands of informative SNPs in nonmodel species quickly and at reasonable cost, even if no prior genomic information is available.
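
Screening for homeologs by excess heterozygosity and departure from Hardy-Weinberg proportions can be sketched as follows. This is an illustration only, not the authors' pipeline: the function name `hwe_flags` and both thresholds (`max_het`, `chi2_cutoff`) are assumptions. The value 3.84 is the 5% critical value of a chi-square distribution with one degree of freedom, and 0.5 is the maximum expected heterozygosity at a biallelic locus.

```python
def hwe_flags(n_AA, n_Aa, n_aa, max_het=0.5, chi2_cutoff=3.84):
    """Flag a biallelic locus as a possible paralog/homeolog.

    A locus is flagged when observed heterozygosity exceeds max_het AND
    genotype counts deviate from Hardy-Weinberg proportions
    (chi-square goodness of fit, 1 degree of freedom).
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    # Skip classes with zero expectation (monomorphic loci).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    obs_het = n_Aa / n
    return {
        "obs_het": obs_het,
        "chi2": chi2,
        "flag": obs_het > max_het and chi2 > chi2_cutoff,
    }


# Hypothetical counts for 24 individuals.
print(hwe_flags(0, 24, 0))   # every individual heterozygous: flagged
print(hwe_flags(6, 12, 6))   # exact Hardy-Weinberg proportions: not flagged
```

With only 24 individuals the chi-square approximation is rough, so an exact test would be the safer choice in practice.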

14.
Mitochondrial DNA (mtDNA) has formed the backbone of phylogeographic research for many years; however, recent trends focus on genome‐wide analyses. One method proposed for calibrating inferences from noisy next‐generation data, such as RAD sequencing, is to compare these results with analyses of mitochondrial sequences. Most researchers using this approach appear to be unaware that many single nucleotide polymorphisms (SNPs) identified from genome‐wide sequence data are themselves mitochondrial, or assume that these are too few to bias analyses. Here, we demonstrate two methods for mining mitochondrial markers using RAD sequence data from three South African species of yellowfish, Labeobarbus. First, we use a rigorous SNP discovery pipeline using the program Stacks to identify variant sites in mtDNA, which we then combine into haplotypes. Second, we directly map sequence reads against a mitochondrial genome reference. This method allowed us to reconstruct up to 98% of the Labeobarbus mitogenome. We validated these mitogenome reconstructions through BLAST database searches and by comparison with cytochrome b gene sequences obtained through Sanger sequencing. Finally, we investigate the organismal consequences of these data including ancient genetic exchange and a recent translocation among populations of L. natalensis, as well as interspecific hybridization between L. aeneus and L. kimberleyensis.

15.
Marker development for marker‐assisted selection in plant breeding is increasingly based on next‐generation sequencing (NGS). However, marker development in crops with highly repetitive, complex genomes is still challenging. Here we applied sequence‐based genotyping (SBG), which couples AFLP®‐based complexity reduction to NGS, for de novo single nucleotide polymorphism (SNP) marker discovery in and genotyping of a biparental durum wheat population. We identified 9983 putative SNPs in 6372 contigs between the two parents and used these SNPs for genotyping 91 recombinant inbred lines (RILs). Excluding redundant information from multiple SNPs per contig, 2606 (41%) markers were used for integration in a pre‐existing framework map, resulting in the integration of 2365 markers over 2607 cM. Of the 2606 markers available for mapping, 91% were integrated in the pre‐existing map, containing 708 SSRs, DArT markers, and SNPs from CRoPS technology, with a map‐size increase of 492 cM (23%). These results demonstrate the high quality of the discovered SNP markers. With this methodology, it was possible to saturate the map at a final marker density of 0.8 cM/marker. Looking at the binned marker distribution (Figure 2), 63 of the 268 10‐cM bins contained only SBG markers, showing that these markers are filling in gaps in the framework map. The main reason that some markers could not be used for mapping was the low sequencing coverage used for genotyping. We conclude that SBG is a valuable tool for efficient, high‐throughput and high‐quality marker discovery and genotyping for complex genomes such as that of durum wheat.

16.
With the advent of next generation sequencing, new avenues have opened to study genomics in wild populations of non‐model species. Here, we describe a successful approach to a genome‐wide medium density Single Nucleotide Polymorphism (SNP) panel in a non‐model species, the house sparrow (Passer domesticus), through the development of a 10 K Illumina iSelect HD BeadChip. Genomic DNA and cDNA derived from six individuals were sequenced on a 454 GS FLX system and generated a total of 1.2 million sequences, in which SNPs were detected. As no reference genome exists for the house sparrow, we used the zebra finch (Taeniopygia guttata) reference genome to determine the most likely position of each SNP. The 10 000 SNPs on the SNP‐chip were selected to be distributed evenly across 31 chromosomes, giving on average one SNP per 100 000 bp. The SNP‐chip was screened across 1968 individual house sparrows from four island populations. Of the original 10 000 SNPs, 7413 were found to be variable, and 99% of these SNPs were successfully called in at least 93% of all individuals. We used the SNP‐chip to demonstrate the ability of such genome‐wide marker data to detect population sub‐division, and compared these results to similar analyses using microsatellites. The SNP‐chip will be used to map Quantitative Trait Loci (QTL) for fitness‐related phenotypic traits in natural populations.

17.
Blue catfish, Ictalurus furcatus, are valued in the United States as a trophy fishery for their capacity to reach large sizes, sometimes exceeding 45 kg. Additionally, blue catfish × channel catfish (I. punctatus) hybrid food fish production has recently increased the demand for blue catfish broodstock. However, there has been little study of the genetic impacts and interaction of farmed, introduced and stocked populations of blue catfish. We utilized genotyping‐by‐sequencing (GBS) to capture and genotype SNP markers on 190 individuals from five wild and domesticated populations (Mississippi River, Missouri, D&B, Rio Grande and Texas). Stringent filtering of SNP‐calling parameters resulted in 4275 SNP loci represented across all five populations. Population genetics and structure analyses revealed potential shared ancestry and admixture between populations. We utilized the Sequenom MassARRAY to validate two multiplex panels of SNPs selected from the GBS data. Selection criteria included SNPs shared between populations, SNPs specific to populations, number of reads per individual and number of individuals genotyped by GBS. Putative SNPs were validated in the discovery population and in two additional populations not used in the GBS analysis. A total of 64 SNPs were genotyped successfully in 191 individuals from nine populations. Our results should guide the development of highly informative, flexible genotyping multiplexes for blue catfish from the larger GBS SNP set as well as provide an example of a rapid, low‐cost approach to generate and genotype informative marker loci in aquatic species with minimal previous genetic information.

18.
19.
Extensive genomic resources are available in the model legume Medicago truncatula. Here, we present the discovery and design of the first array of single‐nucleotide polymorphism (SNP) markers in M. truncatula through large‐scale Sanger resequencing of genomic fragments spanning the genome, in a diverse panel of 16 M. truncatula accessions. Both anonymous fragments and fragments targeting candidate genes for flowering phenology and symbiosis were surveyed for nucleotide variation in almost 230 kb of unique genomic regions. A set of 384 SNP markers was designed for an Illumina GoldenGate assay, genotyped on a collection of 192 inbred lines (CC192) representing the geographical range of the species and used to survey the diversity of two natural populations. Finally, 86% of the tested SNPs were of high quality and exhibited polymorphism in the CC192 collection. Even at the population level, we detected polymorphism for more than 50% of the selected SNPs. Analysis of the allele frequency spectrum in the CC192 showed a reduced ascertainment bias, mostly limited to very rare alleles (frequency <0.01). The substantial polymorphism detected at the species and population levels, the high marker quality and the potential to survey large samples of individuals make this set of SNP markers a valuable tool to improve our understanding of the effect of demographic and selective factors that shape the natural genetic diversity within the selfing species Medicago truncatula.

20.
Estimating the evolutionary potential of quantitative traits and reliably predicting responses to selection in wild populations are important challenges in evolutionary biology. The genomic revolution has opened up opportunities for measuring relatedness among individuals with precision, enabling pedigree‐free estimation of trait heritabilities in wild populations. However, until now, most quantitative genetic studies based on a genomic relatedness matrix (GRM) have focused on long‐term monitored populations for which traditional pedigrees were also available, and have often had access to knowledge of genome sequence and variability. Here, we investigated the potential of RAD‐sequencing for estimating heritability in a free‐ranging roe deer (Capreolus capreolus) population for which no prior genomic resources were available. We propose a step‐by‐step analytical framework to optimize the quality and quantity of the genomic data and explore the impact of the single nucleotide polymorphism (SNP) calling and filtering processes on the GRM structure and GRM‐based heritability estimates. As expected, our results show that sequence coverage strongly affects the number of recovered loci, the genotyping error rate and the amount of missing data. Ultimately, this had little effect on heritability estimates and their standard errors, provided that the GRM was built from a minimum number of loci (above 7,000). Genomic relatedness matrix‐based heritability estimates thus appear robust to a moderate level of genotyping errors in the SNP data set. We also showed that quality filters, such as the removal of low‐frequency variants, affect the relatedness structure of the GRM, generating lower h² estimates. Our work illustrates the huge potential of RAD‐sequencing for estimating GRM‐based heritability in virtually any natural population.
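
A genomic relatedness matrix can be built from SNP genotypes in several ways. The sketch below uses one widely used estimator (VanRaden's first method), which centres allele counts by twice the allele frequency and scales by 2·Σp(1−p). Whether the study used this estimator is an assumption, the function name `grm` is hypothetical, and the toy genotypes are invented for illustration. Missing data are not handled.

```python
def grm(genotypes):
    """Genomic relatedness matrix from 0/1/2 allele counts.

    genotypes: list of individuals, each a list of allele counts per locus
               (complete data assumed; no missing genotypes).
    Returns an n x n matrix as nested lists.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency at each locus, estimated from the sample itself.
    freqs = [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("no polymorphic loci")
    # Centre each genotype by its expected value 2p.
    centred = [[ind[j] - 2 * freqs[j] for j in range(m)] for ind in genotypes]
    return [
        [sum(a * b for a, b in zip(centred[i], centred[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: individuals 0 and 1 are identical, 2 is opposite.
genotypes = [
    [0, 1, 2],
    [0, 1, 2],
    [2, 1, 0],
]
G = grm(genotypes)
for row in G:
    print([round(x, 2) for x in row])
```

Because allele frequencies enter both the centring and the scaling, dropping low‐frequency variants changes the matrix, which is consistent with the filtering effect the abstract reports.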

Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号