1.

Genotyping‐in‐Thousands by sequencing (GT‐seq) panel development and application to minimally invasive DNA samples to support studies in molecular ecology

Danielle A. Schmidt, Nathan R. Campbell, Purnima Govindarajulu, Karl W. Larsen, Michael A. Russello 《Molecular ecology resources》2020,20(1):114-124

Minimally invasive sampling (MIS) is widespread in wildlife studies; however, its utility for massively parallel DNA sequencing (MPS) is limited. Poor sample quality and contamination by exogenous DNA can make MIS challenging to use with modern genotyping‐by‐sequencing approaches, which have been traditionally developed for high‐quality DNA sources. Given that MIS is often more appropriate in many contexts, there is a need to make such samples practical for harnessing MPS. Here, we test the ability of Genotyping‐in‐Thousands by sequencing (GT‐seq), a multiplex amplicon sequencing approach, to effectively genotype minimally invasive cloacal DNA samples collected from the Western Rattlesnake (Crotalus oreganus), a threatened species in British Columbia, Canada. As there was no previous genetic information for this species, an optimized panel of 362 SNPs was selected for use with GT‐seq from a de novo restriction site‐associated DNA sequencing (RADseq) assembly. Comparisons of genotypes generated within and among RADseq and GT‐seq for the same individuals found low rates of genotyping error (GT‐seq: 0.50%; RADseq: 0.80%) and discordance (2.57%), the latter likely due to the different genotype calling models employed. GT‐seq mean genotype discordance between blood and cloacal swab samples collected from the same individuals was also minimal (1.37%). Estimates of population diversity parameters were similar across GT‐seq and RADseq data sets, as were inferred patterns of population structure. Overall, GT‐seq can be effectively applied to low‐quality DNA samples, minimizing the inefficiencies presented by exogenous DNA typically found in minimally invasive samples and continuing the expansion of molecular ecology and conservation genetics in the genomics era.
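
The discordance figures quoted in this abstract are, in essence, the share of loci at which two genotype calls for the same individual disagree. The snippet below is a minimal sketch of that calculation, not the authors' pipeline. It assumes genotypes are stored as two-letter strings with `None` for missing calls; the function name `discordance_rate` and the example data are hypothetical.

```python
from typing import Optional, Sequence


def discordance_rate(calls_a: Sequence[Optional[str]],
                     calls_b: Sequence[Optional[str]]) -> float:
    """Fraction of loci genotyped in both replicates whose calls differ."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both replicates must cover the same loci")
    compared = 0
    mismatched = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue  # skip loci missing in either replicate
        compared += 1
        # Sort alleles so that "AG" and "GA" count as the same genotype.
        if sorted(a) != sorted(b):
            mismatched += 1
    if compared == 0:
        raise ValueError("no loci were genotyped in both replicates")
    return mismatched / compared


# Hypothetical calls for one individual from two sample types.
blood = ["AA", "AG", "GG", None, "CT"]
swab = ["AA", "GA", "AG", "TT", "CT"]
rate = discordance_rate(blood, swab)  # one mismatch out of four compared loci
```

Loci missing in either replicate are excluded from the denominator here; a study could equally treat them as a separate missing-data statistic.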

2.

Testing genotyping strategies for ultra‐deep sequencing of a co‐amplifying gene family: MHC class I in a passerine bird

Aleksandra Biedrzycka, Alvaro Sebastian, Magdalena Migalska, Helena Westerdahl, Jacek Radwan 《Molecular ecology resources》2017,17(4):642-655

Characterization of highly duplicated genes, such as genes of the major histocompatibility complex (MHC), where multiple loci often co‐amplify, has until recently been hindered by insufficient read depths per amplicon. Here, we used ultra‐deep Illumina sequencing to resolve genotypes at exon 3 of MHC class I genes in the sedge warbler (Acrocephalus schoenobaenus). We sequenced 24 individuals in two replicates and used these data, as well as a simulated data set, to test the effect of amplicon coverage (range: 500–20 000 reads per amplicon) on the repeatability of genotyping using four different genotyping approaches. A third replicate employed unique barcoding to assess the extent of tag jumping, that is, swapping of individual tag identifiers, which may confound genotyping. The reliability of MHC genotyping increased with coverage and approached or exceeded 90% within‐method repeatability of allele calling at coverages of >5000 reads per amplicon. We found generally high agreement between genotyping methods, especially at high coverages. High reliability of the tested genotyping approaches was further supported by our analysis of the simulated data set, although the genotyping approach relying primarily on replication of variants in independent amplicons proved sensitive to repeatable errors. According to the most repeatable genotyping method, the number of co‐amplifying variants per individual ranged from 19 to 42. Tag jumping was detectable, but at such low frequencies that it did not affect the reliability of genotyping. We thus demonstrate that gene families with many co‐amplifying genes can be reliably genotyped using HTS, provided that there is sufficient per‐amplicon coverage.

3.

Substantial differences in bias between single‐digest and double‐digest RAD‐seq libraries: A case study

Sarah P. Flanagan, Adam G. Jones 《Molecular ecology resources》2018,18(2):264-280

The trade‐offs of using single‐digest vs. double‐digest restriction site‐associated DNA sequencing (RAD‐seq) protocols have been widely discussed. However, no direct empirical comparisons of the two methods have been conducted. Here, we sampled a single population of Gulf pipefish (Syngnathus scovelli) and genotyped 444 individuals using RAD‐seq. Sixty individuals were subjected to single‐digest RAD‐seq (sdRAD‐seq), and the remaining 384 individuals were genotyped using a double‐digest RAD‐seq (ddRAD‐seq) protocol. We analysed the resulting Illumina sequencing data and compared the two genotyping methods when reads were analysed either together or separately. Coverage statistics, observed heterozygosity, and allele frequencies differed significantly between the two protocols, as did the results of selection components analysis. We also performed an in silico digestion of the Gulf pipefish genome and modelled five major sources of bias: PCR duplicates, polymorphic restriction sites, shearing bias, asymmetric sampling (i.e., genotyping fewer individuals with sdRAD‐seq than with ddRAD‐seq) and higher major allele frequencies. This combination of approaches allowed us to determine that polymorphic restriction sites, an asymmetric sampling scheme, mean allele frequencies and to some extent PCR duplicates all contribute to different estimates of allele frequencies between samples genotyped using sdRAD‐seq versus ddRAD‐seq. Our finding that sdRAD‐seq and ddRAD‐seq can result in different allele frequencies has implications for comparisons across studies and techniques that endeavour to identify genomewide signatures of evolutionary processes in natural populations.

4.

Differentiating diploid and triploid individuals using single nucleotide polymorphisms genotyped by amplicon sequencing

Thomas A. Delomas 《Molecular ecology resources》2019,19(6):1545-1551

Triploidy can occur naturally or be induced in fish and shellfish during artificial propagation in order to produce sterile individuals. Fisheries managers often stock these sterile triploids as a means of improving angling opportunities without risking unwanted reproduction of the stocked fish. Additionally, the rearing of all‐triploid individuals has been suggested as a means to reduce the possibility of escaped aquaculture fish interbreeding with wild populations. Efficient means of determining if an individual is triploid or diploid are therefore needed both to monitor the efficacy of triploidy‐inducing treatments and, when sampling fish from a body of water that has a mixture of diploids and triploids, to determine the ploidy of a fish prior to further analyses. Currently, ploidy is regularly measured through flow cytometry, but this technique typically utilizes a fresh blood sample. This study presents an alternative, cost‐effective method of determining ploidy by analysing amplicon‐sequencing data for biallelic single‐nucleotide polymorphisms (SNPs). For each sample, heterozygous genotypes are identified and the likelihoods of diploidy and triploidy are calculated based on the read counts for each allele. The accuracy of this method is demonstrated using triploid and diploid brook trout (Salvelinus fontinalis) genotyped with a panel of 234 SNPs and Chinook salmon (Oncorhynchus tshawytscha) genotyped with a panel of 298 SNPs following the GT‐seq methodology of amplicon sequencing.
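
The idea of comparing diploid and triploid likelihoods from allele read counts can be illustrated with a simple binomial model. This is a sketch under stated assumptions and not the published method: it assumes a diploid heterozygote yields each allele with probability 1/2, a triploid heterozygote yields the counted allele with probability 1/3 or 2/3 (both dosages taken as equally likely), and it ignores sequencing error and allelic bias. The function names are hypothetical.

```python
import math
from typing import Iterable, Tuple


def _log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log of the binomial probability of k successes in n trials."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coeff + k * math.log(p) + (n - k) * math.log(1 - p)


def ploidy_log_likelihood_ratio(het_counts: Iterable[Tuple[int, int]]) -> float:
    """Sum over heterozygous loci of log L(triploid) - log L(diploid).

    Positive values favour triploidy, negative values favour diploidy.
    """
    llr = 0.0
    for ref_reads, alt_reads in het_counts:
        n = ref_reads + alt_reads
        log_diploid = _log_binom_pmf(ref_reads, n, 0.5)
        # Triploid heterozygote: allele dosage 1:2 or 2:1 (assumed equally likely).
        l1 = _log_binom_pmf(ref_reads, n, 1 / 3)
        l2 = _log_binom_pmf(ref_reads, n, 2 / 3)
        m = max(l1, l2)  # log-sum-exp for numerical stability
        log_triploid = m + math.log(0.5 * math.exp(l1 - m) + 0.5 * math.exp(l2 - m))
        llr += log_triploid - log_diploid
    return llr


# Hypothetical read counts (allele 1, allele 2) at heterozygous SNPs.
balanced = [(50, 50), (48, 52), (26, 24)]   # roughly 1:1, diploid-like
skewed = [(33, 67), (70, 30), (18, 35)]     # roughly 1:2 or 2:1, triploid-like
llr_balanced = ploidy_log_likelihood_ratio(balanced)
llr_skewed = ploidy_log_likelihood_ratio(skewed)
```

Summing the per-locus log-likelihoods treats loci as independent, which is itself an assumption of this sketch.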

5.

SNP discovery and genotyping using restriction‐site‐associated DNA sequencing in chickens

Zhengxiao Zhai, Wenjing Zhao, Chuan He, Kaixuan Yang, Linlin Tang, Shuyun Liu, Yan Zhang, Qizhong Huang, He Meng 《Animal genetics》2015,46(2):216-219

Single nucleotide polymorphisms (SNPs) are essential to the understanding of population genetic variation and diversity. Here, we performed restriction‐site‐associated DNA sequencing (RAD‐seq) on 72 individuals from 13 Chinese indigenous and three introduced chicken breeds. A total of 620 million reads were obtained using an Illumina HiSeq 2000 sequencer. An average of 75 587 SNPs were identified from each individual. Further strict filtering validated 28 895 SNP candidates for all populations. When compared with the NCBI dbSNP (chicken_9031), 15 404 SNPs were new discoveries. This study is the first to apply RAD‐seq to chickens, and the results indicate its effectiveness and its potential for genetic analysis and whole‐genome selection breeding in chickens and other agricultural animals.

6.

Mass production of SNP markers in a nonmodel passerine bird through RAD sequencing and contig mapping to the zebra finch genome

Yann X. C. Bourgeois, Emeline Lhuillier, Timothée Cézard, Joris A. M. Bertrand, Boris Delahaie, Josselin Cornuault, Thomas Duval, Olivier Bouchez, Borja Milá, Christophe Thébaud 《Molecular ecology resources》2013,13(5):899-907

Here, we present an adaptation of restriction‐site‐associated DNA sequencing (RAD‐seq) to the Illumina HiSeq2000 technology that we used to produce SNP markers in very large quantities at low cost per unit in the Réunion grey white‐eye (Zosterops borbonicus), a nonmodel passerine bird species with no reference genome. We sequenced a set of six pools of 18–25 individuals using a single sequencing lane. This allowed us to build around 600 000 contigs, among which at least 386 000 could be mapped to the zebra finch (Taeniopygia guttata) genome. This yielded more than 80 000 SNPs that could be mapped unambiguously and are evenly distributed across the genome. Thus, our approach provides a good illustration of the high potential of paired‐end RAD sequencing of pooled DNA samples combined with comparative assembly to the zebra finch genome to build large contigs and characterize vast numbers of informative SNPs in nonmodel passerine bird species in a very efficient and cost‐effective way.

7.

Random PCR‐based genotyping by sequencing technology GRAS‐Di (genotyping by random amplicon sequencing, direct) reveals genetic structure of mangrove fishes

Sho Hosoya, Shotaro Hirase, Kiyoshi Kikuchi, Kusuto Nanjo, Yohei Nakamura, Hiroyoshi Kohno, Mitsuhiko Sano 《Molecular ecology resources》2019,19(5):1153-1163

While various technologies for high‐throughput genotyping have been developed for ecological studies, simple methods tolerant to low‐quality DNA samples are still limited. In this study, we tested the applicability of a random PCR‐based genotyping‐by‐sequencing technology, genotyping by random amplicon sequencing, direct (GRAS‐Di). We focused on population genetic analysis of estuarine mangrove fishes, including two resident species, the Amboina cardinalfish (Fibramia amboinensis, Bleeker, 1853) and the Duncker's river garfish (Zenarchopterus dunckeri, Mohr, 1926), and a marine migrant, the blacktail snapper (Lutjanus fulvus, Forster, 1801). Collections were from the Ryukyu Islands, southern Japan. PCR amplicons derived from ~130 individuals were pooled and sequenced in a single lane on a HiSeq2500 platform, and an average of three million reads was obtained per individual. Consensus contigs were assembled for each species and used for genotyping of single nucleotide polymorphisms by mapping trimmed reads onto the contigs. After quality filtering steps, 4,000–9,000 putative single nucleotide polymorphisms were detected for each species. Although DNA fragmentation can diminish genotyping performance when analysed on next‐generation sequencing technology, the effect was small. Genetic differentiation and a clear pattern of isolation‐by‐distance were observed in F. amboinensis and Z. dunckeri by means of principal component analysis, F_ST and admixture analysis. By contrast, L. fulvus comprised a genetically homogeneous population with directional recent gene flow. These genetic differentiation patterns reflect patterns of estuary use through life history. These results showed the power of GRAS‐Di for fine‐grained genetic analysis using field samples, including mangrove fishes.

8.

Genotyping‐in‐Thousands by sequencing reveals marked population structure in Western Rattlesnakes to inform conservation status

Danielle A. Schmidt, Purnima Govindarajulu, Karl W. Larsen, Michael A. Russello 《Ecology and evolution》2020,10(14):7157-7172

Delineation of units below the species level is critical for prioritizing conservation actions for species at‐risk. Genetic studies play an important role in characterizing patterns of population connectivity and diversity to inform the designation of conservation units, especially for populations that are geographically isolated. The northernmost range margin of Western Rattlesnakes (Crotalus oreganus) occurs in British Columbia, Canada, where it is federally classified as threatened and restricted to five geographic regions. In these areas, Western Rattlesnakes hibernate (den) communally, raising questions about connectivity within and between den complexes. At present, Western Rattlesnake conservation efforts are hindered by a complete lack of information on genetic structure and degree of isolation at multiple scales, from the den to the regional level. To fill this knowledge gap, we used Genotyping‐in‐Thousands by sequencing (GT‐seq) to genotype an optimized panel of 362 single nucleotide polymorphisms (SNPs) from individual samples (n = 461) collected across the snake's distribution in western Canada and neighboring Washington (USA). Hierarchical STRUCTURE analyses found evidence for population structure within and among the five geographic regions in BC, as well as in Washington. Within these regions, 11 genetically distinct complexes of dens were identified, with some regions having multiple complexes. No significant pattern of isolation‐by‐distance and generally low levels of migration were detected among den complexes across regions. Additionally, snakes within dens generally were more related than those among den complexes within a region, indicating limited movement. Overall, our results suggest that the single, recognized designatable unit for Western Rattlesnakes in Canada should be re‐assessed to proactively focus conservation efforts on preserving total genetic variation detected range‐wide. More broadly, our study demonstrates a novel application of GT‐seq for investigating patterns of diversity in wild populations at multiple scales to better inform conservation management.

9.

Multiplex sequencing of bacterial artificial chromosomes for assembling complex plant genomes

Sebastian Beier, Axel Himmelbach, Thomas Schmutzer, Marius Felder, Stefan Taudien, Klaus F. X. Mayer, Matthias Platzer, Nils Stein, Uwe Scholz, Martin Mascher 《Plant biotechnology journal》2016,14(7):1511-1522

Hierarchical shotgun sequencing remains the method of choice for assembling high‐quality reference sequences of complex plant genomes. The efficient exploitation of current high‐throughput technologies and powerful computational facilities for large‐insert clone sequencing necessitates the sequencing and assembly of a large number of clones in parallel. We developed a multiplexed pipeline for shotgun sequencing and assembling individual bacterial artificial chromosomes (BACs) using the Illumina sequencing platform. We illustrate our approach by sequencing 668 barley BACs (Hordeum vulgare L.) in a single Illumina HiSeq 2000 lane. Using a newly designed parallelized computational pipeline, we obtained sequence assemblies of individual BACs that consist, on average, of eight sequence scaffolds and represent >98% of the genomic inserts. Our BAC assemblies are clearly superior to a whole‐genome shotgun assembly regarding contiguity, completeness and the representation of the gene space. Our methods may be employed to rapidly obtain high‐quality assemblies of a large number of clones to assemble map‐based reference sequences of plant and animal species with complex genomes by sequencing along a minimum tiling path.

10.

amplisas: a web server for multilocus genotyping using next‐generation amplicon sequencing data

Alvaro Sebastian, Magdalena Herdegen, Magdalena Migalska, Jacek Radwan 《Molecular ecology resources》2016,16(2):498-510

Next‐generation sequencing (NGS) technologies are revolutionizing the fields of biology and medicine as powerful tools for amplicon sequencing (AS). Using combinations of primers and barcodes, it is possible to sequence targeted genomic regions with deep coverage for hundreds, even thousands, of individuals in a single experiment. This is extremely valuable for the genotyping of gene families in which locus‐specific primers are often difficult to design, such as the major histocompatibility complex (MHC). The utility of AS is, however, limited by the high intrinsic sequencing error rates of NGS technologies and other sources of error such as polymerase amplification or chimera formation. Correcting these errors requires extensive bioinformatic post‐processing of NGS data. Amplicon Sequence Assignment (amplisas) is a tool that performs analysis of AS results in a simple and efficient way, while offering customization options for advanced users. amplisas is designed as a three‐step pipeline consisting of (i) read demultiplexing, (ii) unique sequence clustering and (iii) erroneous sequence filtering. Allele sequences and frequencies are retrieved in Excel spreadsheet format, making them easy to interpret. amplisas performance has been successfully benchmarked against previously published genotyped MHC data sets obtained with various NGS technologies.
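
The three steps named in this abstract (demultiplexing, clustering of unique sequences, filtering of likely errors) can be pictured with a toy example. The code below is only a schematic sketch and does not reproduce the amplisas tool or its algorithms: it assumes exact barcode prefixes, treats "clustering" as collapsing identical sequences, and filters on a single per-sample frequency threshold. All names, barcodes and reads are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List


def demultiplex(reads: Iterable[str], barcodes: Dict[str, str]) -> Dict[str, List[str]]:
    """Step i: assign reads to samples by exact barcode prefix and strip the barcode."""
    per_sample: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                per_sample[sample].append(read[len(barcode):])
                break  # reads matching no barcode are silently discarded
    return dict(per_sample)


def filter_variants(sequences: List[str], min_freq: float) -> Dict[str, float]:
    """Steps ii and iii: collapse identical sequences, then drop rare ones."""
    counts = Counter(sequences)          # one cluster per unique sequence
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items() if n / total >= min_freq}


# Hypothetical two-sample run with 2-base barcodes.
barcodes = {"AC": "sample_1", "GT": "sample_2"}
reads = ["ACTTTT", "ACTTTT", "ACTTTA", "ACTTTT", "ACTTTT",
         "GTCCCC", "GTCCCC", "NNGGGG"]
genotypes = {sample: filter_variants(seqs, min_freq=0.25)
             for sample, seqs in demultiplex(reads, barcodes).items()}
```

In this toy run the single `TTTA` read in `sample_1` falls below the 25% threshold and is treated as an artefact, while the unbarcoded read is dropped during demultiplexing.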

11.

Parallel tagged amplicon sequencing of relatively long PCR products using the Illumina HiSeq platform and transcriptome assembly

Yan‐Jie Feng, Qing‐Feng Liu, Meng‐Yun Chen, Dan Liang, Peng Zhang 《Molecular ecology resources》2016,16(1):91-102

12.

Targeted sequencing of the human X chromosome exome

Mondal K, Shetty AC, Patel V, Cutler DJ, Zwick ME 《Genomics》2011,98(4):260-265

We used a RainDance Technologies (RDT) expanded content library to enrich the human X chromosome exome (2.5 Mb) from 26 male samples followed by Illumina sequencing. Our multiplex primer library covered 98.05% of the human X chromosome exome in a single tube with 11,845 different PCR amplicons. Illumina sequencing of 24 male samples showed coverage for 97% of the targeted sequences. Sequence from 2 HapMap samples confirmed missing data rates of 2–3% at sites successfully typed by the HapMap project, with an accuracy of at least ~99.5% as compared to reported HapMap genotypes. Our demonstration that a RDT expanded content library can efficiently enrich and enable the routine sequencing of the human X chromosome exome suggests a wide variety of potential research and clinical applications for this platform.

13.

Ultra‐deep Illumina sequencing accurately identifies MHC class IIb alleles and provides evidence for copy number variation in the guppy (Poecilia reticulata)

Jackie Lighten, Cock van Oosterhout, Ian G. Paterson, Mark McMullan, Paul Bentzen 《Molecular ecology resources》2014,14(4):753-767

We address the bioinformatic issue of accurately separating amplified genes of the major histocompatibility complex (MHC) from artefacts generated during high‐throughput sequencing workflows. We fit observed ultra‐deep sequencing depths (hundreds to thousands of sequences per amplicon) of allelic variants to expectations from genetic models of copy number variation (CNV). We provide a simple, accurate and repeatable method for genotyping multigene families, evaluating our method via analyses of 209 bp of MHC class IIb exon 2 in guppies (Poecilia reticulata). Genotype repeatability for resequenced individuals (N = 49) was high (100%) within the same sequencing run. However, repeatability dropped to 83.7% between independent runs, either because of lower mean amplicon sequencing depth in the initial run or random PCR effects. This highlights the importance of fully independent replicates. Significant improvements in genotyping accuracy were made by greatly reducing type I genotyping error (i.e. accepting an artefact as a true allele), which may occur when using low‐depth allele validation thresholds used by previous methods. Only a small amount (4.9%) of type II error (i.e. rejecting a genuine allele as an artefact) was detected through fully independent sequencing runs. We observed 1–6 alleles per individual, and evidence of sharing of alleles across loci. Variation in the total number of MHC class II loci among individuals, both among and within populations, was also observed, and some genotypes appeared to be partially hemizygous; total allelic dosage added up to an odd number of allelic copies. Collectively, observations provide evidence of MHC CNV and its complex basis in natural populations.

14.

Design of a 9K illumina BeadChip for polar bears (Ursus maritimus) from RAD and transcriptome sequencing

René M. Malenfant, David W. Coltman, Corey S. Davis 《Molecular ecology resources》2015,15(3):587-600

15.

Targeted multiplex next‐generation sequencing: advances in techniques of mitochondrial and nuclear DNA sequencing for population genomics

Brittany L. Hancock‐Hanser, Amy Frey, Matthew S. Leslie, Peter H. Dutton, Frederick I. Archer, Phillip A. Morin 《Molecular ecology resources》2013,13(2):254-268

Next‐generation sequencing (NGS) is emerging as an efficient and cost‐effective tool in population genomic analyses of nonmodel organisms, allowing simultaneous resequencing of many regions of multi‐genomic DNA from multiplexed samples. Here, we detail our synthesis of protocols for targeted resequencing of mitochondrial and nuclear loci by generating indexed genomic libraries for multiplexing up to 100 individuals in a single sequencing pool, and then enriching the pooled library using custom DNA capture arrays. Our use of DNA sequence from one species to capture and enrich the sequencing libraries of another species (i.e. cross‐species DNA capture) indicates that efficient enrichment occurs when sequences are up to about 12% divergent, allowing us to take advantage of genomic information in one species to sequence orthologous regions in related species. In addition to a complete mitochondrial genome on each array, we have included between 43 and 118 nuclear loci for low‐coverage sequencing of between 18 kb and 87 kb of DNA sequence per individual for single nucleotide polymorphism discovery from 50 to 100 individuals in a single sequencing lane. Using this method, we have generated a total of over 500 whole mitochondrial genomes from seven cetacean species and green sea turtles. The greater variation detected in mitogenomes relative to short mtDNA sequences is helping to resolve genetic structure ranging from geographic to species‐level differences. These NGS and analysis techniques have allowed for simultaneous population genomic studies of mtDNA and nDNA with greater genomic coverage and phylogeographic resolution than has previously been possible in marine mammals and turtles.

16.

Next‐generation DNA barcoding: using next‐generation sequencing to enhance and accelerate DNA barcode capture from single specimens

Shadi Shokralla, Joel F. Gibson, Hamid Nikbakht, Daniel H. Janzen, Winnie Hallwachs, Mehrdad Hajibabaei 《Molecular ecology resources》2014,14(5):892-901

DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large‐scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next‐generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next‐generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10‐mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full‐length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16 lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full‐length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next‐generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content.

17.

Double‐digest RAD sequencing using Ion Proton semiconductor platform (ddRADseq‐ion) with nonmodel organisms

Hans Recknagel, Arne Jacobs, Pawel Herzyk, Kathryn R. Elmer 《Molecular ecology resources》2015,15(6):1316-1329

Research in evolutionary biology involving nonmodel organisms is rapidly shifting from using traditional molecular markers such as mtDNA and microsatellites to higher throughput SNP genotyping methodologies to address questions in population genetics, phylogenetics and genetic mapping. Restriction site associated DNA sequencing (RAD sequencing or RADseq) has become an established method for SNP genotyping on Illumina sequencing platforms. Here, we developed a protocol and adapters for double‐digest RAD sequencing for Ion Torrent (Life Technologies; Ion Proton, Ion PGM) semiconductor sequencing. We sequenced thirteen genomic libraries of three different nonmodel vertebrate species on Ion Proton with PI chips: Arctic charr Salvelinus alpinus, European whitefish Coregonus lavaretus and common lizard Zootoca vivipara. This resulted in ~962 million single‐end reads overall and a mean of ~74 million reads per library. We filtered the genomic data using Stacks, a bioinformatic tool to process RAD sequencing data. On average, we obtained ~11 000 polymorphic loci per library of 6–30 individuals. We validate our new method by technical and biological replication, by reconstructing phylogenetic relationships, and using a hybrid genetic cross to track genomic variants. Finally, we discuss the differences between using the different sequencing platforms in the context of RAD sequencing, assessing possible advantages and disadvantages. We show that our protocol can be used for Ion semiconductor sequencing platforms for the rapid and cost‐effective generation of variable and reproducible genetic markers.

18.

Optimization of the genotyping‐by‐sequencing strategy for population genomic analysis in conifers

Jin Pan, Baosheng Wang, Zhi‐Yong Pei, Wei Zhao, Jie Gao, Jian‐Feng Mao, Xiao‐Ru Wang 《Molecular ecology resources》2015,15(4):711-722

Flexibility and low cost make genotyping‐by‐sequencing (GBS) an ideal tool for population genomic studies of nonmodel species. However, to utilize the potential of the method fully, many parameters affecting library quality and single nucleotide polymorphism (SNP) discovery require optimization, especially for conifer genomes with a high repetitive DNA content. In this study, we explored strategies for effective GBS analysis in pine species. We constructed GBS libraries using HpaII, PstI and EcoRI‐MseI digestions with different multiplexing levels and examined the effect of restriction enzymes on library complexity and the impact of sequencing depth and size selection of restriction fragments on sequence coverage bias. We tested and compared UNEAK, Stacks and GATK pipelines for the GBS data, and then developed a reference‐free SNP calling strategy for haploid pine genomes. Our GBS procedure proved to be effective in SNP discovery, producing 7000–11 000 and 14 751 SNPs within and among three pine species, respectively, from a PstI library. This investigation provides guidance for the design and analysis of GBS experiments, particularly for organisms for which genomic information is lacking.

19.

Finding the right coverage: the impact of coverage and sequence quality on single nucleotide polymorphism genotyping error rates

Emily D. Fountain, Jonathan N. Pauli, Brendan N. Reid, Per J. Palsbøll, M. Zachariah Peery 《Molecular ecology resources》2016,16(4):966-978

Restriction‐enzyme‐based sequencing methods enable the genotyping of thousands of single nucleotide polymorphism (SNP) loci in nonmodel organisms. However, in contrast to traditional genetic markers, genotyping error rates in SNPs derived from restriction‐enzyme‐based methods remain largely unknown. Here, we estimated genotyping error rates in SNPs genotyped with double digest RAD sequencing from Mendelian incompatibilities in known mother–offspring dyads of Hoffman's two‐toed sloth (Choloepus hoffmanni) across a range of coverage and sequence quality criteria, for both reference‐aligned and de novo‐assembled data sets. Genotyping error rates were more sensitive to coverage than sequence quality and low coverage yielded high error rates, particularly in de novo‐assembled data sets. For example, coverage ≥5 yielded median genotyping error rates of ≥0.03 and ≥0.11 in reference‐aligned and de novo‐assembled data sets, respectively. Genotyping error rates declined to ≤0.01 in reference‐aligned data sets with a coverage ≥30, but remained ≥0.04 in the de novo‐assembled data sets. We observed approximately 10‐ and 13‐fold declines in the number of loci sampled in the reference‐aligned and de novo‐assembled data sets when coverage was increased from ≥5 to ≥30 at quality score ≥30, respectively. Finally, we assessed the effects of genotyping coverage on a common population genetic application, parentage assignments, and showed that the proportion of incorrectly assigned maternities was relatively high at low coverage. Overall, our results suggest that the trade‐off between sample size and genotyping error rates be considered prior to building sequencing libraries, that reporting genotyping error rates become standard practice, and that effects of genotyping errors on inference be evaluated in restriction‐enzyme‐based SNP studies.
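
A Mendelian incompatibility in a mother–offspring dyad arises when, at a biallelic locus, the two genotypes share no allele, which cannot happen without a genotyping error (or a mutation or misassigned pair). The snippet below is a minimal sketch of counting such incompatibilities and is not the authors' analysis. Genotypes are assumed to be two-letter strings with `None` for missing calls, and the function name and data are hypothetical. Because many errors still leave a shared allele, the resulting rate understates the true per-genotype error rate.

```python
from typing import Optional, Sequence


def mendelian_incompatibility_rate(mother: Sequence[Optional[str]],
                                   offspring: Sequence[Optional[str]]) -> float:
    """Fraction of jointly genotyped loci where mother and offspring share no allele."""
    if len(mother) != len(offspring):
        raise ValueError("both individuals must cover the same loci")
    compared = 0
    incompatible = 0
    for m, o in zip(mother, offspring):
        if m is None or o is None:
            continue  # ignore loci missing in either individual
        compared += 1
        # Offspring must carry at least one maternal allele at every locus.
        if not set(m) & set(o):
            incompatible += 1
    if compared == 0:
        raise ValueError("no loci were genotyped in both individuals")
    return incompatible / compared


# Hypothetical dyad: only the third locus (CC vs TT) is incompatible.
mother = ["AA", "AG", "CC", "TT", None, "GG"]
offspring = ["AG", "GG", "TT", "CT", "AA", "AG"]
rate = mendelian_incompatibility_rate(mother, offspring)
```

Applying the same function after filtering loci at different coverage thresholds would give a coverage-versus-error comparison of the kind the abstract describes.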

20.

Development of SNP‐genotyping arrays in two shellfish species

S. Lapègue, E. Harrang, S. Heurtebise, E. Flahauw, C. Donnadieu, P. Gayral, M. Ballenghien, L. Genestout, L. Barbotte, R. Mahla, P. Haffray, C. Klopp 《Molecular ecology resources》2014,14(4):820-830

Use of SNPs has been favoured due to their abundance in plant and animal genomes, accompanied by the falling cost and rising throughput capacity for detection and genotyping. Here, we present in vitro (obtained from targeted sequencing) and in silico discovery of SNPs, and the design of medium‐throughput genotyping arrays for two oyster species, the Pacific oyster, Crassostrea gigas, and European flat oyster, Ostrea edulis. Two sets of 384 SNP markers were designed for two Illumina GoldenGate arrays and genotyped on more than 1000 samples for each species. In each case, oyster samples were obtained from wild and selected populations and from three‐generation families segregating for traits of interest in aquaculture. The rate of successfully genotyped polymorphic SNPs was about 60% for each species. Effects of SNP origin and quality on genotyping success (Illumina functionality Score) were analysed and compared with other model and nonmodel species. Furthermore, a simulation was made based on a subset of the C. gigas SNP array with a minor allele frequency of 0.3 and typical crosses used in shellfish hatcheries. This simulation indicated that at least 150 markers were needed to perform an accurate parental assignment. Such panels might provide valuable tools to improve our understanding of the connectivity between wild (and selected) populations and could contribute to future selective breeding programmes.