Similar literature
 20 similar articles found (search time: 15 ms)
1.
We are writing in response to the population and phylogenomics meeting review by Andrews & Luikart (2014) entitled ‘Recent novel approaches for population genomics data analysis’. Restriction‐site‐associated DNA (RAD) sequencing has become a powerful and useful approach in molecular ecology, with several different published methods now available to molecular ecologists, none of which can be considered the best option in all situations. A&L report that the original RAD protocol of Miller et al. (2007) and Baird et al. (2008) is superior to all other RAD variants because putative PCR duplicates can be identified (see Baxter et al. 2011), thereby reducing the impact of PCR artefacts on allele frequency estimates (Andrews & Luikart 2014). In response, we (i) challenge the assertion that the original RAD protocol minimizes the impact of PCR artefacts relative to that of other RAD protocols, (ii) present additional biases in RADseq that are at least as important as PCR artefacts in selecting a RAD protocol and (iii) highlight the strengths and weaknesses of four different approaches to RADseq which are a representative sample of all RAD variants: the original RAD protocol (mbRAD, Miller et al. 2007; Baird et al. 2008), double digest RAD (ddRAD, Peterson et al. 2012), ezRAD (Toonen et al. 2013) and 2bRAD (Wang et al. 2012). With an understanding of the strengths and weaknesses of different RAD protocols, researchers can make a more informed decision when selecting a RAD protocol.

2.
The KwaZulu‐Natal yellowfish (Labeobarbus natalensis) is an abundant cyprinid, endemic to KwaZulu‐Natal Province, South Africa. In this study, we developed a single‐nucleotide polymorphism (SNP) dataset from double‐digest restriction site‐associated DNA (ddRAD) sequencing of samples across the distribution. We addressed several hidden challenges, primarily focusing on proper filtering of RAD data and selecting optimal parameters for data processing in polyploid lineages. We used the resulting high‐quality SNP dataset to investigate the population genetic structure of L. natalensis. A small number of mitochondrial markers present in these data had disproportionate influence on the recovered genetic structure. The presence of singleton SNPs also confounded genetic structure. We found a well‐supported division into northern and southern lineages, with further subdivision into five populations, one of which reflects north–south admixture. Approximate Bayesian Computation scenario testing supported a scenario where an ancestral population diverged into northern and southern lineages, which then diverged to yield the current five populations. All river systems showed similar levels of genetic diversity, which appears unrelated to drainage system size. Nucleotide diversity was highest in the smallest river system, the Mbokodweni, which, together with adjacent small coastal systems, should be considered as a key catchment for conservation.

3.
Information on genetic relationships among individuals is essential to many studies of the behaviour and ecology of wild organisms. Parentage and relatedness assays based on large numbers of single nucleotide polymorphism (SNP) loci hold substantial advantages over the microsatellite markers traditionally used for these purposes. We present a double‐digest restriction site‐associated DNA sequencing (ddRAD‐seq) analysis pipeline that, as such, simultaneously achieves the SNP discovery and genotyping steps and which is optimized to return a statistically powerful set of SNP markers (typically 150–600 after stringent filtering) from large numbers of individuals (up to 240 per run). We explore the trade‐offs inherent in this approach through a set of experiments in a species with a complex social system, the variegated fairy‐wren (Malurus lamberti) and further validate it in a phylogenetically broad set of other bird species. Through direct comparisons with a parallel data set from a robust panel of highly variable microsatellite markers, we show that this ddRAD‐seq approach results in substantially improved power to discriminate among potential relatives and considerably more precise estimates of relatedness coefficients. The pipeline is designed to be universally applicable to all bird species (and with minor modifications to many other taxa), to be cost‐ and time‐efficient, and to be replicable across independent runs such that genotype data from different study periods can be combined and analysed as field samples are accumulated.

4.
DNA microarray and next-generation DNA sequencing technologies are important tools for high-throughput genome research, revealing both the structural and functional characteristics of genomes. In the past decade, DNA microarray technologies have been widely applied in studies of functional genomics, systems biology and pharmacogenomics. The next-generation DNA sequencing method was first introduced by the 454 Company in 2003, immediately followed by the establishment of the Solexa and SOLiD techniques by other biotech companies. Although this technology emerged only recently, its rapid and impressive improvement has extended its application to almost all fields of genomics research, where it now rivals the existing DNA microarray technology. This paper briefly reviews the working principles of these two technologies as well as their applications and perspectives in genome research. Supported by the National High-Tech Research Program of China (Grant No. 2006AA020704) and Shanghai Science and Technology Commission (Grant No. 05DZ22201)

5.
Genomic studies of invasive species can reveal both invasive pathways and functional differences underpinning patterns of colonization success. The European green crab (Carcinus maenas) was initially introduced to eastern North America nearly 200 years ago where it expanded northwards to eastern Nova Scotia. A subsequent invasion to Nova Scotia from a northern European source allowed further range expansion, providing a unique opportunity to study the invasion genomics of a species with multiple invasions. Here, we use restriction‐site‐associated DNA sequencing‐derived SNPs to explore fine‐scale genomewide differentiation between these two invasions. We identified 9137 loci from green crab sampled from 11 locations along eastern North America and compared spatial variation to mitochondrial COI sequence variation used previously to characterize these invasions. Overall spatial divergence among invasions was high (pairwise FST ~0.001 to 0.15) and spread across many loci, with a mean FST ~0.052 and 52% of loci examined characterized by FST values >0.05. The majority of the most divergent loci (i.e., outliers, ~1.2%) displayed latitudinal clines in allele frequency highlighting extensive genomic divergence among the invasions. Discriminant analysis of principal components (both neutral and outlier loci) clearly resolved the two invasions spatially and was highly correlated with mitochondrial divergence. Our results reveal extensive cryptic intraspecific genomic diversity associated with differing patterns of colonization success and demonstrate clear utility for genomic approaches to delineating the distribution and colonization success of aquatic invasive species.
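The summary statistics quoted in this abstract (a mean FST and the share of loci above 0.05) can be illustrated with a minimal sketch. The abstract does not say which FST estimator was used; the code below assumes a simple Nei-style heterozygosity formula, FST = (HT - HS) / HT, for a biallelic SNP in two equally weighted populations, and averages per-locus values. The function names `nei_fst` and `summarize_fst` are hypothetical.

```python
def nei_fst(p1, p2):
    """Per-locus F_ST for one biallelic SNP in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    Assumption: Nei-style F_ST = (H_T - H_S) / H_T, not necessarily the
    estimator used in the study.
    """
    # Mean expected heterozygosity within populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population.
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # monomorphic locus: no differentiation to measure
    return (h_t - h_s) / h_t


def summarize_fst(freq_pairs, threshold=0.05):
    """Return (mean per-locus F_ST, share of loci with F_ST above threshold)."""
    values = [nei_fst(p1, p2) for p1, p2 in freq_pairs]
    mean_fst = sum(values) / len(values)
    share_above = sum(v > threshold for v in values) / len(values)
    return mean_fst, share_above
```

Averaging per-locus ratios is a simplification chosen for readability; published estimators usually combine numerators and denominators across loci and correct for sample size.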

6.
The large yellow croaker, Larimichthys crocea, is a commercially important drum fish (Family: Sciaenidae) native to the East and South China Sea. Habitat deterioration and overfishing have led to significant population decline and the collapse of its fishery over the past decades. Today, the market supply of L. crocea depends solely on stocks produced in hatcheries and farms. Common issues that occur in the culture of L. crocea include germplasm degradation, precocious puberty, elevated disease susceptibility and growth retardation. In this study, we employed SLAF‐seq (specific‐locus amplified fragment sequencing) technology to identify single nucleotide polymorphism (SNP) loci across the L. crocea genome. Sixty samples were selected for SLAF analysis out of 1000 progeny in the same cohort of a cultured stock. Our analysis obtained a total of 151 253 SLAFs, of which 65.88% (99 652) were identified to be polymorphic, scoring a total of 710 567 putative SNPs. Further filtration resulted in a final panel of 1782 SNP loci. The data derived from this work could be beneficial for understanding the genetics of complex phenotypic traits as well as for developing marker‐selection‐assisted breeding programs in L. crocea.

7.
The application of next-generation sequencing (NGS) technologies for the development of simple sequence repeat (SSR) or microsatellite loci for genetic research in the botanical sciences is described. Microsatellite markers are one of the most informative and versatile DNA-based markers used in plant genetic research, but their development has traditionally been a difficult and costly process. NGS technologies allow the efficient identification of large numbers of microsatellites at a fraction of the cost and effort of traditional approaches. The major advantage of NGS methods is their ability to produce large amounts of sequence data from which to isolate and develop numerous genome-wide and gene-based microsatellite loci. The two major NGS technologies with emergent application in SSR isolation are 454 and Illumina. A review is provided of several recent studies demonstrating the efficient use of 454 and Illumina technologies for the discovery of microsatellites in plants. Additionally, important aspects during NGS isolation and development of microsatellites are discussed, including the use of computational tools and high-throughput genotyping methods. A data set of microsatellite loci in the plastome and mitochondriome of cranberry (Vaccinium macrocarpon Ait.) is provided to illustrate a successful application of 454 sequencing for SSR discovery. In the future, NGS technologies will massively increase the number of SSRs and other genetic markers available to conduct genetic research in understudied but economically important crops such as cranberry.

8.
Decreasing sequencing costs have driven a rapid expansion of novel genotyping methods. One of these methods is the exploitation of restriction enzyme cut sites to generate genome‐wide but reduced representation sequencing libraries (RRLs), alternatively termed genotyping by sequencing or restriction‐site associated DNA sequencing. Without a reference genome, the resulting short sequence reads must be assembled de novo. There are many possible assembly programs, most not explicitly developed for RRL data, and we know little of their effectiveness. In this issue of Molecular Ecology Resources, LaCava et al. (2020) systematically evaluate six commonly used programs and two commonly varied parameters for complete and accurate assembly of RRLs, using simulated double digests of Homo sapiens and Arabidopsis thaliana genomes with varied mutation rates and types. The authors find substantial variation in performance across assembly programs. The most consistently high‐performing assembler is infrequently used in their literature survey (CD‐HIT; Li and Godzik, 2006), while several others fail to produce complete, accurate assemblies under many conditions. LaCava et al. additionally recommend best practices in parameter choice and evaluation of future assembly programs, advice that molecular ecologists working to assemble sequences of all kinds should take to heart.

9.
There has been remarkably little attention to using the high resolution provided by genotyping‐by‐sequencing (i.e., RADseq and similar methods) for assessing relatedness in wildlife populations. A major hurdle is the genotyping error, especially allelic dropout, often found in this type of data that could lead to downward‐biased, yet precise, estimates of relatedness. Here, we assess the applicability of genotyping‐by‐sequencing for relatedness inferences given its relatively high genotyping error rate. Individuals of known relatedness were simulated under genotyping error, allelic dropout and missing data scenarios based on an empirical ddRAD data set, and their true relatedness was compared to that estimated by seven relatedness estimators. We found that an estimator chosen through such analyses can circumvent the influence of genotyping error, with the estimator of Ritland (Genetics Research, 67, 175) shown to be unaffected by allelic dropout and to be the most accurate when there is genotyping error. We also found that the choice of estimator should not rely solely on the strength of correlation between estimated and true relatedness as a strong correlation does not necessarily mean estimates are close to true relatedness. We also demonstrated how even a large SNP data set with genotyping error (allelic dropout or otherwise) or missing data still performs better than a perfectly genotyped microsatellite data set of tens of markers. The simulation‐based approach used here can be easily implemented by others on their own genotyping‐by‐sequencing data sets to confirm the most appropriate and powerful estimator for their data.
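The allelic dropout scenario mentioned in this abstract can be sketched as a small simulation step. This is an illustration only, not the authors' pipeline: it assumes diploid genotypes stored as allele pairs and a single per-genotype dropout probability, and the function name `apply_allelic_dropout` is hypothetical.

```python
import random


def apply_allelic_dropout(genotypes, dropout_rate, rng):
    """Return a copy of diploid genotypes with allelic dropout applied.

    genotypes: list of (allele_a, allele_b) tuples.
    dropout_rate: assumed probability that a heterozygote loses one allele
        and is therefore scored as a homozygote.
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    observed = []
    for a, b in genotypes:
        if a != b and rng.random() < dropout_rate:
            # One allele fails to be sequenced; the survivor is called twice.
            kept = rng.choice((a, b))
            observed.append((kept, kept))
        else:
            # Homozygotes look the same with or without dropout.
            observed.append((a, b))
    return observed


rng = random.Random(42)
true_genotypes = [(0, 1), (1, 1), (0, 0), (0, 1)]
observed_genotypes = apply_allelic_dropout(true_genotypes, 0.3, rng)
```

Because only heterozygotes are altered, dropout inflates apparent homozygosity, which is the kind of error a relatedness estimator would then be tested against.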

10.
Research in evolutionary biology involving nonmodel organisms is rapidly shifting from using traditional molecular markers such as mtDNA and microsatellites to higher throughput SNP genotyping methodologies to address questions in population genetics, phylogenetics and genetic mapping. Restriction site associated DNA sequencing (RAD sequencing or RADseq) has become an established method for SNP genotyping on Illumina sequencing platforms. Here, we developed a protocol and adapters for double‐digest RAD sequencing for Ion Torrent (Life Technologies; Ion Proton, Ion PGM) semiconductor sequencing. We sequenced thirteen genomic libraries of three different nonmodel vertebrate species on Ion Proton with PI chips: Arctic charr Salvelinus alpinus, European whitefish Coregonus lavaretus and common lizard Zootoca vivipara. This resulted in ~962 million single‐end reads overall and a mean of ~74 million reads per library. We filtered the genomic data using Stacks, a bioinformatic tool to process RAD sequencing data. On average, we obtained ~11 000 polymorphic loci per library of 6–30 individuals. We validate our new method by technical and biological replication, by reconstructing phylogenetic relationships, and using a hybrid genetic cross to track genomic variants. Finally, we discuss the differences between using the different sequencing platforms in the context of RAD sequencing, assessing possible advantages and disadvantages. We show that our protocol can be used for Ion semiconductor sequencing platforms for the rapid and cost‐effective generation of variable and reproducible genetic markers.

11.
12.
Mitochondrial DNA (mtDNA) has formed the backbone of phylogeographic research for many years; however, recent trends focus on genome‐wide analyses. One method proposed for calibrating inferences from noisy next‐generation data, such as RAD sequencing, is to compare these results with analyses of mitochondrial sequences. Most researchers using this approach appear to be unaware that many single nucleotide polymorphisms (SNPs) identified from genome‐wide sequence data are themselves mitochondrial, or assume that these are too few to bias analyses. Here, we demonstrate two methods for mining mitochondrial markers using RAD sequence data from three South African species of yellowfish, Labeobarbus. First, we use a rigorous SNP discovery pipeline using the program stacks, to identify variant sites in mtDNA, which we then combine into haplotypes. Second, we directly map sequence reads against a mitochondrial genome reference. This method allowed us to reconstruct up to 98% of the Labeobarbus mitogenome. We validated these mitogenome reconstructions through blast database searches and by comparison with cytochrome b gene sequences obtained through Sanger sequencing. Finally, we investigate the organismal consequences of these data including ancient genetic exchange and a recent translocation among populations of L. natalensis, as well as interspecific hybridization between L. aeneus and L. kimberleyensis.

13.
For half a century population genetics studies have put type II restriction endonucleases to work. Now, coupled with massively‐parallel, short‐read sequencing, the family of RAD protocols that wields these enzymes has generated vast genetic knowledge from the natural world. Here, we describe the first software natively capable of using paired‐end sequencing to derive short contigs from de novo RAD data. Stacks version 2 employs a de Bruijn graph assembler to build and connect contigs from forward and reverse reads for each de novo RAD locus, which it then uses as a reference for read alignments. The new architecture allows all the individuals in a metapopulation to be considered at the same time as each RAD locus is processed. This enables a Bayesian genotype caller to provide precise SNPs, and a robust algorithm to phase those SNPs into long haplotypes, generating RAD loci that are 400–800 bp in length. To prove its recall and precision, we tested the software with simulated data and compared reference‐aligned and de novo analyses of three empirical data sets. Our study shows that the latest version of Stacks is highly accurate and outperforms other software in assembling and genotyping paired‐end de novo data sets.

14.
Salmonid genomes are considered to be in a pseudo‐tetraploid state as a result of a genome duplication event that occurred between 25 and 100 Ma. This situation complicates single‐nucleotide polymorphism (SNP) discovery in rainbow trout as many putative SNPs are actually paralogous sequence variants (PSVs) and not simple allelic variants. To differentiate PSVs from simple allelic variants, we used 19 homozygous doubled haploid (DH) lines that represent a wide geographical range of rainbow trout populations. In the first phase of the study, we analysed SbfI restriction‐site associated DNA (RAD) sequence data from all the 19 lines and selected 11 lines for an extended SNP discovery. In the second phase, we conducted the extended SNP discovery using PstI RAD sequence data from the selected 11 lines. The complete data set is composed of 145 168 high‐quality putative SNPs that were genotyped in at least nine of the 11 lines, of which 71 446 (49%) had minor allele frequencies (MAF) of at least 18% (i.e. at least two of the 11 lines). Approximately 14% of the RAD SNPs in this data set are from expressed or coding rainbow trout sequences. Our comparison of the current data set with previous SNP discovery data sets revealed that 99% of our SNPs are novel. In the support files for this resource, we provide annotation to the positions of the SNPs in the working draft of the rainbow trout reference genome, provide the genotypes of each sample in the discovery panel and identify SNPs that are likely to be in coding sequences.
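The MAF threshold in this abstract follows from simple arithmetic: each doubled haploid line is homozygous, so it contributes a single allele state, and a minor allele carried by two of 11 lines has a frequency of 2/11, about 18%. A minimal sketch of that calculation is shown below; it assumes biallelic SNPs and treats `None` as a missing genotype, and the function name `minor_allele_frequency` is hypothetical.

```python
from collections import Counter


def minor_allele_frequency(line_alleles):
    """MAF across homozygous lines, one allele call per line.

    line_alleles: list of allele calls such as "A" or "G"; None marks a
    line with no genotype. Assumes a biallelic SNP.
    """
    called = [allele for allele in line_alleles if allele is not None]
    counts = Counter(called)
    if len(counts) < 2:
        return 0.0  # monomorphic among the genotyped lines
    return min(counts.values()) / len(called)


# Two of 11 lines carry the minor allele: 2 / 11, roughly 0.18.
all_lines = ["A"] * 9 + ["G"] * 2
maf_all = minor_allele_frequency(all_lines)

# With only nine lines genotyped, the same two carriers give 2 / 9.
nine_lines = ["A"] * 7 + ["G"] * 2 + [None, None]
maf_nine = minor_allele_frequency(nine_lines)
```

The second case shows why the denominator matters when a SNP is genotyped in only nine of the 11 lines, the minimum the abstract allows.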

15.
16.
Species delimitation is fundamental to conservation and sustainable use of economically important forest tree species. However, the delimitation of two highly valued gold‐thread nanmu species (Phoebe bournei (Hemsl.) Yang and P. zhennan S. K. Lee & F. N. Wei) has been confusing and debated. To address this problem, we integrated morphology and restriction site‐associated DNA sequencing (RAD‐seq) to define their species boundaries. We obtained consistent results from the two datasets, supporting two distinct lineages corresponding to P. bournei and P. zhennan. In P. bournei, higher order leaf venation is more prominent, petioles are thicker, and leaf apex angle is narrower, compared to P. zhennan. Both datasets also revealed that the former putative P. bournei populations from northeastern Guizhou belong to P. zhennan. The two species are now distinct in distributions except in the Wuling Mountains, where they overlap. Phoebe bournei occurs mainly in central Fujian, southern Jiangxi, the Nanling Mountains, and the Wuling Mountains, whereas P. zhennan is found in the adjoining eastern regions of the Qionglai Mountains, the southern Sichuan hills, and the Wuling Mountains. The re‐delimitation of P. bournei and P. zhennan and clarification of their ranges provide a better scientific basis guiding the conservation and sustainable utilization of these tree species.

17.
18.
Population increases over the past several decades provide natural settings in which to study the evolutionary processes that occur during bottleneck, growth, and spatial expansion. We used parallel natural experiments of historical decline and subsequent recovery in two sympatric pinniped species in the Northwest Atlantic, the gray seal (Halichoerus grypus atlantica) and harbor seal (Phoca vitulina vitulina), to study the impact of recent demographic change on genomic diversity. Using restriction site‐associated DNA sequencing, we assessed genomic diversity at over 8,700 polymorphic gray seal loci and 3,700 polymorphic harbor seal loci in samples from multiple cohorts collected throughout recovery over the past half‐century. Despite significant differences in the degree of genetic diversity assessed in the two species, we found signatures of historical bottlenecks in the contemporary genomes of both gray and harbor seals. We evaluated temporal trends in diversity across cohorts, as well as compared samples from sites at both the center and edge of a recent gray seal range expansion, but found no significant change in genomewide diversity following recovery. We did, however, find that the variance and degree of allele frequency change measured over the past several decades were significantly different from neutral expectations of drift under population growth. These two cases of well‐described demographic history provide opportunities for critical evaluation of current approaches to simulating and understanding the genetic effects of historical demographic change in natural populations.

19.
The advent of high‐throughput sequencing (HTS) has made genomic‐level analyses feasible for nonmodel organisms. A critical step of many HTS pipelines involves aligning reads to a reference genome to identify variants. Despite recent initiatives, only a fraction of species has publicly available reference genomes. Therefore, a common practice is to align reads to the genome of an organism related to the target species; however, this could affect read alignment and bias genotyping. In this study, I conducted an experiment using empirical RADseq datasets generated for two species of salmonids (Actinopterygii; Teleostei; Salmonidae) to address these questions. There are currently reference genomes for six salmonids of varying phylogenetic distance. I aligned the RADseq data to all six genomes and identified variants with several different genotypers, which were then fed into population genetic analyses. Increasing phylogenetic distance between target species and reference genome reduced the proportion of reads that successfully aligned and mapping quality. Reference genome also influenced the number of SNPs that were generated and depth at those SNPs, although the effect varied by genotyper. Inferences of population structure were mixed: increasing reference genome divergence reduced estimates of differentiation but similar patterns of population relationships were found across scenarios. These findings reveal how the choice of reference genome can influence the output of bioinformatic pipelines. They also emphasize the need to identify best practices and guidelines for the burgeoning field of biodiversity genomics.

20.