Similar articles
 20 similar articles found (search took 31 ms)
1.
Advanced resources for genome‐assisted research in barley (Hordeum vulgare) including a whole‐genome shotgun assembly and an integrated physical map have recently become available. These have made possible studies that aim to assess genetic diversity or to isolate single genes by whole‐genome resequencing and in silico variant detection. However, such an approach remains expensive given the 5 Gb size of the barley genome. Targeted sequencing of the mRNA‐coding exome reduces barley genomic complexity more than 50‐fold, thus dramatically reducing this heavy sequencing and analysis load. We have developed and employed an in‐solution hybridization‐based sequence capture platform to selectively enrich for a 61.6 megabase coding sequence target that includes predicted genes from the genome assembly of the cultivar Morex as well as publicly available full‐length cDNAs and de novo assembled RNA‐Seq consensus sequence contigs. The platform provides a highly specific capture with substantial and reproducible enrichment of targeted exons, both for cultivated barley and related species. We show that this exome capture platform provides a clear path towards a broader and deeper understanding of the natural variation residing in the mRNA‐coding part of the barley genome and will thus constitute a valuable resource for applications such as mapping‐by‐sequencing and genetic diversity analyses.

2.
High‐throughput DNA sequencing facilitates the analysis of large portions of the genome in nonmodel organisms, ensuring high accuracy of population genetic parameters. However, empirical studies evaluating the appropriate sample size for these kinds of studies are still scarce. In this study, we use double‐digest restriction‐associated DNA sequencing (ddRADseq) to recover thousands of single nucleotide polymorphisms (SNPs) for two physically isolated populations of Amphirrhox longifolia (Violaceae), a nonmodel plant species for which no reference genome is available. We used resampling techniques to construct simulated populations with a random subset of individuals and SNPs to determine how many individuals and biallelic markers should be sampled for accurate estimates of intra‐ and interpopulation genetic diversity. We identified 3646 and 4900 polymorphic SNPs for the two populations of A. longifolia, respectively. Our simulations show that, overall, a sample size greater than eight individuals has little impact on estimates of genetic diversity within A. longifolia populations, when 1000 or more SNPs are used. Our results also show that even at a very small sample size (i.e. two individuals), accurate estimates of FST can be obtained with a large number of SNPs (≥1500). These results highlight the potential of high‐throughput genomic sequencing approaches to address questions related to evolutionary biology in nonmodel organisms. Furthermore, our findings also provide insights into the optimization of sampling strategies in the era of population genomics.
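
The resampling idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the 0/1/2 genotype coding, and the use of uncorrected expected heterozygosity (2pq) as the diversity statistic are all assumptions made for the example.

```python
import random
import statistics


def expected_heterozygosity(genotypes):
    """Mean expected heterozygosity (2pq) over biallelic SNPs.

    genotypes: list of individuals, each a list of alternate-allele
    counts (0, 1 or 2), one entry per SNP. No small-sample correction
    is applied (a simplification for this sketch).
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    total = 0.0
    for j in range(n_snp):
        p = sum(ind[j] for ind in genotypes) / (2 * n_ind)
        total += 2 * p * (1 - p)
    return total / n_snp


def resample_diversity(genotypes, n_ind, n_snp, replicates=100, seed=0):
    """Repeatedly draw n_ind individuals and n_snp SNPs without replacement.

    Returns the mean and population standard deviation of the diversity
    estimate across replicates (hypothetical helper).
    """
    rng = random.Random(seed)
    all_snps = range(len(genotypes[0]))
    estimates = []
    for _ in range(replicates):
        inds = rng.sample(genotypes, n_ind)
        snps = rng.sample(all_snps, n_snp)
        subset = [[ind[j] for j in snps] for ind in inds]
        estimates.append(expected_heterozygosity(subset))
    return statistics.mean(estimates), statistics.pstdev(estimates)
```

Running `resample_diversity` over a grid of `n_ind` and `n_snp` values and watching where the spread of estimates stops shrinking is one way to look for the kind of plateau the abstract describes.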

3.
A growing variety of “genotype-by-sequencing” (GBS) methods use restriction enzymes and high throughput DNA sequencing to generate data for a subset of genomic loci, allowing the simultaneous discovery and genotyping of thousands of polymorphisms in a set of multiplexed samples. We evaluated a “double-digest” restriction-site associated DNA sequencing (ddRAD-seq) protocol by 1) comparing results for a zebra finch (Taeniopygia guttata) sample with in silico predictions from the zebra finch reference genome; 2) assessing data quality for a population sample of indigobirds (Vidua spp.); and 3) testing for consistent recovery of loci across multiple samples and sequencing runs. Comparison with in silico predictions revealed that 1) over 90% of predicted, single-copy loci in our targeted size range (178–328 bp) were recovered; 2) short restriction fragments (38–178 bp) were carried through the size selection step and sequenced at appreciable depth, generating unexpected but nonetheless useful data; 3) amplification bias favored shorter, GC-rich fragments, contributing to among locus variation in sequencing depth that was strongly correlated across samples; 4) our use of restriction enzymes with a GC-rich recognition sequence resulted in an up to four-fold overrepresentation of GC-rich portions of the genome; and 5) star activity (i.e., non-specific cutting) resulted in thousands of “extra” loci sequenced at low depth. Results for three species of indigobirds show that a common set of thousands of loci can be consistently recovered across both individual samples and sequencing runs. In a run with 46 samples, we genotyped 5,996 loci in all individuals and 9,833 loci in 42 or more individuals, resulting in <1% missing data for the larger data set. We compare our approach to similar methods and discuss the range of factors (fragment library preparation, natural genetic variation, bioinformatics) influencing the recovery of a consistent set of loci among samples.
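
An in silico double digest of the kind this abstract compares against can be sketched as below. This is an illustrative simplification under stated assumptions: the recognition sequences are passed in by the caller (the abstract does not name the enzymes), cut positions are taken as the start of each recognition site rather than the true cleavage offset, only one strand is scanned, and the function names are hypothetical.

```python
import re


def find_sites(seq, site):
    """Return 0-based start positions of every match, including overlaps."""
    return [m.start() for m in re.finditer(f"(?={re.escape(site)})", seq)]


def double_digest(seq, site_a, site_b, min_len, max_len):
    """Predict fragments flanked by one site of each enzyme.

    A fragment is kept only if its two ends come from different enzymes
    and its length falls within [min_len, max_len]. Lengths are measured
    between recognition-site starts (an assumption of this sketch).
    """
    cuts = sorted(
        [(p, "A") for p in find_sites(seq, site_a)]
        + [(p, "B") for p in find_sites(seq, site_b)]
    )
    fragments = []
    for (p1, e1), (p2, e2) in zip(cuts, cuts[1:]):
        length = p2 - p1
        if e1 != e2 and min_len <= length <= max_len:
            fragments.append((p1, p2, length))
    return fragments
```

Calling `double_digest(genome_seq, site_a, site_b, 178, 328)` on each chromosome would give a first approximation of the loci expected in the targeted size range, which can then be compared with the loci actually recovered.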

4.
Genotyping by sequencing (GBS) is a restriction enzyme based targeted approach developed to reduce the genome complexity and discover genetic markers when a priori sequence information is unavailable. Sufficient coverage at each locus is essential to distinguish heterozygous from homozygous sites accurately. The number of GBS samples able to be pooled in one sequencing lane is limited by the number of restriction sites present in the genome and the read depth required at each site per sample for accurate calling of single-nucleotide polymorphisms. Loci bias was observed using a slight modification of the Elshire et al. method: some restriction enzyme sites were represented in higher proportions while others were poorly represented or absent. This bias could be due to the quality of genomic DNA, the endonuclease and ligase reaction efficiency, the distance between restriction sites, the preferential amplification of small library restriction fragments, or bias towards cluster formation of small amplicons during the sequencing process. To overcome these issues, we have developed a GBS method based on randomly tagging genomic DNA (rtGBS). By randomly landing on the genome, we can, with less bias, find restriction sites that are far apart, and undetected by the standard GBS (stdGBS) method. The study comprises two types of biological replicates: six different kiwifruit plants and two independent DNA extractions per plant; and three types of technical replicates: four samples of each DNA extraction, stdGBS vs. rtGBS methods, and two independent library amplifications, each sequenced in separate lanes. A statistically significant unbiased distribution of restriction fragment size by rtGBS showed that this method targeted 49% (39,145) of BamH I sites shared with the reference genome, compared to only 14% (11,513) by stdGBS.

5.
To enable rapid selection of traits in marker‐assisted breeding, markers must be technically simple, low‐cost, high‐throughput and randomly distributed in a genome. We developed such a technology, designated as Multiplex Restriction Amplicon Sequencing (MRASeq), which reduces genome complexity by polymerase chain reaction (PCR) amplification of amplicons flanked by restriction sites. The first PCR primers contain restriction site sequences at 3’‐ends, preceded by 6‐10 bases of specific or degenerate nucleotide sequences and then by a unique M13‐tail sequence which serves as a binding site for a second PCR that adds sequencing primers and barcodes to allow sample multiplexing for sequencing. The sequences of restriction sites and adjacent nucleotides can be altered to suit different species. Physical mapping of MRASeq SNPs from a biparental population of allohexaploid wheat (Triticum aestivum L.) showed a random distribution of SNPs across the genome. MRASeq generated thousands of SNPs from a wheat biparental population and natural populations of wheat and barley (Hordeum vulgare L.). This novel, next‐generation sequencing‐based genotyping platform can be used for linkage mapping to screen quantitative trait loci (QTL), background selection in breeding and many other genetics and breeding applications of various species.

6.
Next‐generation sequencing technologies now allow researchers of non‐model systems to perform genome‐based studies without the requirement of a (often unavailable) closely related genomic reference. We evaluated the role of restriction endonuclease (RE) selection in double‐digest restriction‐site‐associated DNA sequencing (ddRADseq) by generating reduced representation genome‐wide data using four different RE combinations. Our expectation was that RE selections targeting longer, more complex restriction sites would recover fewer loci than RE with shorter, less complex sites. We sequenced a diverse sample of non‐model arachnids, including five congeneric pairs of harvestmen (Opiliones) and four pairs of spiders (Araneae). Sample pairs consisted of either conspecifics or closely related congeneric taxa, and in total 26 sample pair analyses were tested. Sequence demultiplexing, read clustering and variant calling were performed in the pyRAD program. The 6‐base pair cutter EcoRI combined with methylated site‐specific 4‐base pair cutter MspI produced, on average, the greatest numbers of intra‐individual loci and shared loci per sample pair. As expected, the number of shared loci recovered for a sample pair covaried with the degree of genetic divergence, estimated with cytochrome oxidase I sequences, although this relationship was non‐linear. Our comparative results will prove useful in guiding protocol selection for ddRADseq experiments on many arachnid taxa where reference genomes, even from closely related species, are unavailable.

7.
8.
Data from a large‐scale restriction site‐associated DNA sequencing (RAD‐Seq) study of nine butterflyfish species in the Red Sea and Arabian Sea provided a means to test the utility of a recently published draft genome (Chaetodon austriacus) and assess apparent bias in this method of isolating nuclear loci. We here processed double‐digest restriction site‐associated DNA (ddRAD) sequencing data to identify single nucleotide polymorphism (SNP) markers and their associated function with and without our reference genome to see whether it improves the quality of RAD‐Seq. Our analyses indicate (i) a modest gap between the number of nonannotated versus annotated SNPs across all species, (ii) an advantage of using genomic resources for closely related but not distantly related butterflyfish species based on the ability to assign putative gene function to SNPs and (iii) an enrichment of genes among sister butterflyfish taxa related to calcium transmembrane transport and binding. The latter result highlights the potential for this approach to reveal insights into adaptive mechanisms in populations inhabiting challenging coral reef environments such as the Red Sea, Arabian Sea and Arabian Gulf with further study.

9.
10.
High‐throughput sequencing has revolutionized population and conservation genetics. RAD sequencing methods, such as 2b‐RAD, can be used on species lacking a reference genome. However, transferring protocols across taxa can potentially lead to poor results. We tested two different IIB enzymes (AlfI and CspCI) on two species with different genome sizes (the loggerhead turtle Caretta caretta and the sharpsnout seabream Diplodus puntazzo) to build a set of guidelines to improve 2b‐RAD protocols on non‐model organisms while optimising costs. Good results were obtained even with degraded samples, showing the value of 2b‐RAD in studies with poor DNA quality. However, library quality was found to be a critical parameter on the number of reads and loci obtained for genotyping. Resampling analyses with different numbers of reads per individual showed a trade‐off between number of loci and number of reads per sample. The resulting accumulation curves can be used as a tool to calculate the number of sequences per individual needed to reach a mean depth ≥20 reads to acquire good genotyping results. Finally, we demonstrated that selective‐base ligation does not affect genomic differentiation between individuals, indicating that this technique can be used in species with large genome sizes to adjust the number of loci to the study scope, to reduce sequencing costs and to maintain suitable sequencing depth for a reliable genotyping without compromising the results. Here, we provide a set of guidelines to improve 2b‐RAD protocols on non‐model organisms with different genome sizes, helping decision‐making for a reliable and cost‐effective genotyping.

11.
Massively parallel sequencing of a small proportion of the whole genome at high coverage enables answering a wide range of questions from molecular evolution and evolutionary biology to animal and plant breeding and forensics. In this study, we describe the development of a restriction‐site associated DNA (RAD) sequencing approach for the Ion Torrent PGM platform. Our protocol results in extreme genome complexity reduction using two rare‐cutting restriction enzymes and strict size selection of the library, allowing sequencing of a relatively small number of genomic fragments with high sequencing depth. We applied this approach to a common freshwater fish species, the Eurasian perch (Perca fluviatilis L.), generated over 2.2 MB of novel sequence data consisting of ~17 000 contigs, and identified 1259 single nucleotide polymorphisms (SNPs). We also estimated genetic differentiation between the DNA pools from freshwater (Lake Peipus) and brackish water (the Baltic Sea) populations and identified SNPs with the strongest signal of differentiation that could be used for robust individual assignment in the future. This work represents an important step towards developing genomic resources and genetic tools for the Eurasian perch. We expect that our ddRAD sequencing protocol for semiconductor sequencing technology will be a useful alternative to currently available RAD protocols.

12.
Restriction‐enzyme‐based sequencing methods enable the genotyping of thousands of single nucleotide polymorphism (SNP) loci in nonmodel organisms. However, in contrast to traditional genetic markers, genotyping error rates in SNPs derived from restriction‐enzyme‐based methods remain largely unknown. Here, we estimated genotyping error rates in SNPs genotyped with double digest RAD sequencing from Mendelian incompatibilities in known mother–offspring dyads of Hoffman's two‐toed sloth (Choloepus hoffmanni) across a range of coverage and sequence quality criteria, for both reference‐aligned and de novo‐assembled data sets. Genotyping error rates were more sensitive to coverage than sequence quality and low coverage yielded high error rates, particularly in de novo‐assembled data sets. For example, coverage ≥5 yielded median genotyping error rates of ≥0.03 and ≥0.11 in reference‐aligned and de novo‐assembled data sets, respectively. Genotyping error rates declined to ≤0.01 in reference‐aligned data sets with a coverage ≥30, but remained ≥0.04 in the de novo‐assembled data sets. We observed approximately 10‐ and 13‐fold declines in the number of loci sampled in the reference‐aligned and de novo‐assembled data sets when coverage was increased from ≥5 to ≥30 at quality score ≥30, respectively. Finally, we assessed the effects of genotyping coverage on a common population genetic application, parentage assignments, and showed that the proportion of incorrectly assigned maternities was relatively high at low coverage. Overall, our results suggest that the trade‐off between sample size and genotyping error rates be considered prior to building sequencing libraries, that reporting genotyping error rates become standard practice, and that effects of genotyping errors on inference be evaluated in restriction‐enzyme‐based SNP studies.
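
A Mendelian incompatibility check for a mother–offspring dyad can be sketched as follows. This is a hedged illustration only: the function name and the 0/1/2 genotype coding are assumptions, and the authors' actual estimator may differ. With the father unknown, the only detectable incompatibility at a biallelic SNP is a pair of opposite homozygotes, so the rate below undercounts true genotyping errors.

```python
def mendelian_incompatibility_rate(mother, offspring):
    """Fraction of compared SNPs where mother and offspring cannot share an allele.

    mother, offspring: sequences of alternate-allele counts (0, 1 or 2),
    with None marking a missing genotype. Returns None if no SNP has
    both genotypes called.
    """
    compared = 0
    incompatible = 0
    for m, o in zip(mother, offspring):
        if m is None or o is None:
            continue  # skip loci where either genotype is missing
        compared += 1
        if {m, o} == {0, 2}:  # opposite homozygotes share no allele
            incompatible += 1
    if compared == 0:
        return None
    return incompatible / compared
```

Applying this after filtering genotypes at successive coverage thresholds (for example ≥5 and ≥30) would show how the incompatibility rate and the number of compared loci change together.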

13.
Population genetic studies in nonmodel organisms are often hampered by a lack of reference genomes that are essential for whole‐genome resequencing. In the light of this, genotyping methods have been developed to effectively eliminate the need for a reference genome, such as genotyping by sequencing or restriction site‐associated DNA sequencing (RAD‐seq). However, what remains relatively poorly studied is how accurately these methods capture both the average and the variation in genetic diversity across an organism's genome. In this issue of Molecular Ecology Resources, Dutoit et al. (2016) use whole‐genome resequencing data from the collared flycatcher to assess what factors drive heterogeneity in nucleotide diversity across the genome. Using these data, they then simulate how well different sequencing designs, including RAD sequencing, could capture most of the variation in genetic diversity. They conclude that for evolutionary and conservation‐related studies focused on estimating genomic diversity, researchers should emphasize the number of loci analysed over the number of individuals sequenced.

14.
15.
Target sequence capture is an efficient technique to enrich specific genomic regions for high‐throughput sequencing in ecological and evolutionary studies. In recent years, many sequence capture approaches have been proposed, but most of them rely on commercial synthetic baits, which make the experiment expensive. Here, we present a novel sequence capture approach called AFLP‐based genome sequence capture (AFLP Capture). This method uses the AFLP (amplified fragment length polymorphism) technique to generate homemade capture baits without the need for prior genome information, and is thus applicable to any organism. In this approach, biotinylated AFLP fragments representing a random fraction of the genome are used as baits to capture the homologous fragments from genomic shotgun sequencing libraries. In a trial study, by using AFLP Capture, we successfully obtained 511 orthologous loci (>700,000 bp in total length) from 11 Odorrana species and more than 100,000 single nucleotide polymorphisms (SNPs) in four analyzed individuals of an Odorrana species. This result shows that our method can be used to address questions of various evolutionary depths (from interspecies level to intraspecies level). We also discuss the flexibility in bait preparation and how the sequencing data are analyzed. In summary, AFLP Capture is a rapid and flexible tool and can significantly reduce the experimental cost for phylogenetic studies that require analyzing genome‐scale data (hundreds or thousands of loci).

16.
Paris M, Despres L. Molecular Ecology, 2012, 21(7): 1672-1686
AFLP‐based genome scans are widely used to study the genetics of adaptation and to identify genomic regions potentially under selection. However, this approach usually fails to detect the actual genes or mutations targeted by selection owing to the difficulties of obtaining DNA sequences from AFLP fragments. Here, we combine classical AFLP outlier detection with 454 sequencing of AFLP fragments to obtain sequences from outlier loci. We applied this approach to the study of resistance to Bacillus thuringiensis israelensis (Bti) toxins in the dengue vector Aedes aegypti. A genome scan of Bti‐resistant and Bti‐susceptible A. aegypti laboratory strains was performed based on 432 AFLP markers. Fourteen outliers were detected using two different population genetic algorithms. Out of these, 11 were successfully sequenced. Three contained transposable element (TE) sequences, and the 10 outliers that could be mapped at a unique location in the reference genome were located on different supercontigs. One outlier was in the vicinity of a gene coding for an aminopeptidase potentially involved in Bti toxin‐binding. Patterns of sequence variability of this gene showed significant deviation from neutrality in the resistant strain but not in the susceptible strain, even after taking into account the known demographic history of the selected strain. This gene is a promising candidate for future functional analysis.

17.
In recent years, the availability of reduced representation library (RRL) methods has catalysed an expansion of genome‐scale studies to characterize both model and non‐model organisms. Most of these methods rely on the use of restriction enzymes to obtain DNA sequences at a genome‐wide level. These approaches have been widely used to sequence thousands of markers across individuals for many organisms at a reasonable cost, revolutionizing the field of population genomics. However, there are still some limitations associated with these methods, in particular the high molecular weight DNA required as starting material, the reduced number of common loci among investigated samples, and the short length of the sequenced site‐associated DNA. Here, we present MobiSeq, a RRL protocol exploiting simple laboratory techniques, that generates genomic data based on PCR targeted enrichment of transposable elements and the sequencing of the associated flanking region. We validate its performance across 103 DNA extracts derived from three mammalian species: grey wolf (Canis lupus), red deer complex (Cervus sp.) and brown rat (Rattus norvegicus). MobiSeq enables the sequencing of hundreds of thousands of loci across the genome and performs SNP discovery with relatively low rates of clonality. Given the ease and flexibility of the MobiSeq protocol, the method has the potential to be implemented for marker discovery and population genomics across a wide range of organisms, enabling the exploration of diverse evolutionary and conservation questions.

18.
Ficus erecta, a wild relative of the common fig (F. carica), is a donor of Ceratocystis canker resistance in fig breeding programmes. Interspecific hybridization followed by recurrent backcrossing is an effective method to transfer the resistance trait from wild to cultivated fig. However, this process is time consuming and labour intensive for trees, especially for gynodioecious plants such as fig. In this study, genome resources were developed for F. erecta to facilitate fig breeding programmes. The genome sequence of F. erecta was determined using single‐molecule real‐time sequencing technology. The resultant assembly spanned 331.6 Mb with 538 contigs and an N50 length of 1.9 Mb, from which 51 806 high‐confidence genes were predicted. Pseudomolecule sequences corresponding to the chromosomes of F. erecta were established with a genetic map based on single nucleotide polymorphisms from double‐digest restriction‐site‐associated DNA sequencing. Subsequent linkage analysis and whole‐genome resequencing identified a candidate gene for the Ceratocystis canker resistance trait. Genome‐wide genotyping analysis enabled the selection of female lines that possessed resistance and effective elimination of the donor genome from the progeny. The genome resources provided in this study will accelerate and enhance disease‐resistance breeding programmes in fig.

19.
RenSeq is a NB‐LRR (nucleotide binding‐site leucine‐rich repeat) gene‐targeted, Resistance gene enrichment and sequencing method that enables discovery and annotation of pathogen resistance gene family members in plant genome sequences. We successfully applied RenSeq to the sequenced potato Solanum tuberosum clone DM, and increased the number of identified NB‐LRRs from 438 to 755. The majority of these identified R gene loci reside in poorly or previously unannotated regions of the genome. Sequence and positional details on the 12 chromosomes have been established for 704 NB‐LRRs and can be accessed through a genome browser that we provide. We compared these NB‐LRR genes and the corresponding oligonucleotide baits with the highest sequence similarity and demonstrated that ~80% sequence identity is sufficient for enrichment. Analysis of the sequenced tomato S. lycopersicum ‘Heinz 1706’ extended the NB‐LRR complement to 394 loci. We further describe a methodology that applies RenSeq to rapidly identify molecular markers that co‐segregate with a pathogen resistance trait of interest. In two independent segregating populations involving the wild Solanum species S. berthaultii (Rpi‐ber2) and S. ruiz‐ceballosii (Rpi‐rzc1), we were able to apply RenSeq successfully to identify markers that co‐segregate with resistance towards the late blight pathogen Phytophthora infestans. These SNP identification workflows were designed as easy‐to‐adapt Galaxy pipelines.

20.
Population genetic studies of nonmodel organisms frequently employ reduced representation library (RRL) methodologies, many of which rely on protocols in which genomic DNA is digested by one or more restriction enzymes. However, because high molecular weight DNA is recommended for these protocols, samples with degraded DNA are generally unsuitable for RRL methods. Given that ancient and historic specimens can provide key temporal perspectives to evolutionary questions, we explored how custom‐designed RNA probes could enrich for RRL loci (Restriction Enzyme‐Associated Loci baits, or REALbaits). Starting with genotyping‐by‐sequencing (GBS) data generated on modern common ragweed (Ambrosia artemisiifolia L.) specimens, we designed 20 000 RNA probes to target well‐characterized genomic loci in herbarium voucher specimens dating from 1835 to 1913. Compared to shotgun sequencing, we observed enrichment of the targeted loci at 19‐ to 151‐fold. Using our GBS capture pipeline on a data set of 38 herbarium samples, we discovered 22 813 SNPs, providing sufficient genomic resolution to distinguish geographic populations. For these samples, we found that dilution of REALbaits to 10% of their original concentration still yielded sufficient data for downstream analyses and that a sequencing depth of ~7m reads was sufficient to characterize most loci without wasting sequencing capacity. In addition, we observed that targeted loci had highly variable rates of success, which we primarily attribute to similarity between loci, a trait that ultimately interferes with unambiguous read mapping. Our findings can help researchers design capture experiments for RRL loci, thereby providing an efficient means to integrate samples with degraded DNA into existing RRL data sets.
