Similar articles
20 similar articles found (search time: 562 ms)
1.
The identification of mutations in targeted genes has been significantly simplified by the advent of TILLING (Targeting Induced Local Lesions In Genomes), speeding up the functional genomic analysis of animals and plants. Next‐generation sequencing (NGS) is gradually replacing classical TILLING for mutation detection, as it allows the analysis of a large number of amplicons in a short time. The NGS approach was used to identify mutations in a population of Solanum lycopersicum (tomato) that was doubly mutagenized by ethylmethane sulphonate (EMS). Twenty‐five genes belonging to carotenoid and folate metabolism were PCR‐amplified and screened to identify potentially beneficial alleles. To augment efficiency, the 600‐bp amplicons were directly sequenced in a non‐overlapping manner in Illumina MiSeq, obviating the need for a fragmentation step before library preparation. A comparison of the different pooling depths revealed that heterozygous mutations could be identified up to 128‐fold pooling. An evaluation of six different software programs (camba, crisp, gatk unified genotyper, lofreq, snver and vipr) revealed that no software program was robust enough to predict mutations with high fidelity. Among these, crisp and camba predicted mutations with lower false discovery rates. The false positives were largely eliminated by considering only mutations commonly predicted by two different software programs. The screening of 23.47 Mb of tomato genome yielded 75 predicted mutations, 64 of which were confirmed by Sanger sequencing with an average mutation density of 1/367 Kb. Our results indicate that NGS combined with multiple variant detection tools can reduce false positives and significantly speed up the mutation discovery rate.
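The two quantitative steps in this abstract, the mutation density and the "called by at least two programs" filter, can be sketched as follows. This is only an illustration under stated assumptions: the function names and the example variant calls are hypothetical and do not come from the study, and only the figures 23.47 Mb and 64 confirmed mutations are taken from the abstract.

```python
from collections import Counter


def kb_per_mutation(screened_mb: float, n_mutations: int) -> float:
    """Average number of kb screened per confirmed mutation."""
    return screened_mb * 1000 / n_mutations


def consensus_calls(calls_by_tool: dict, min_tools: int = 2) -> set:
    """Keep only variants reported by at least `min_tools` different callers."""
    counts = Counter(v for calls in calls_by_tool.values() for v in set(calls))
    return {variant for variant, n in counts.items() if n >= min_tools}


# Hypothetical calls (gene, position, change); NOT data from the study.
example_calls = {
    "crisp": {("geneA", 101, "G>A"), ("geneB", 57, "C>T")},
    "camba": {("geneA", 101, "G>A"), ("geneC", 330, "G>A")},
    "lofreq": {("geneC", 330, "G>A"), ("geneD", 12, "C>T")},
}

# Figures quoted in the abstract: 23.47 Mb screened, 64 confirmed mutations.
print(f"1 mutation per {kb_per_mutation(23.47, 64):.0f} kb")
print(sorted(consensus_calls(example_calls)))
```

With the quoted figures the density works out to roughly one mutation per 367 kb, matching the abstract. The consensus filter trades some sensitivity for fewer false positives, since a variant seen by only one caller is discarded.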

2.
Next-generation sequencing (NGS) technologies have transformed genomic research and have the potential to revolutionize clinical medicine. However, the background error rates of sequencing instruments and limitations in targeted read coverage have precluded the detection of rare DNA sequence variants by NGS. Here we describe a method, termed CypherSeq, which combines double-stranded barcoding error correction and rolling circle amplification (RCA)-based target enrichment to vastly improve NGS-based rare variant detection. The CypherSeq methodology involves the ligation of sample DNA into circular vectors, which contain double-stranded barcodes for computational error correction and adapters for library preparation and sequencing. CypherSeq is capable of detecting rare mutations genome-wide as well as those within specific target genes via RCA-based enrichment. We demonstrate that CypherSeq is capable of correcting errors incurred during library preparation and sequencing to reproducibly detect mutations down to a frequency of 2.4 × 10⁻⁷ per base pair, and report the frequency and spectra of spontaneous and ethyl methanesulfonate-induced mutations across the Saccharomyces cerevisiae genome.

3.
Next‐generation sequencing (NGS) is emerging as an efficient and cost‐effective tool in population genomic analyses of nonmodel organisms, allowing simultaneous resequencing of many regions of multi‐genomic DNA from multiplexed samples. Here, we detail our synthesis of protocols for targeted resequencing of mitochondrial and nuclear loci by generating indexed genomic libraries for multiplexing up to 100 individuals in a single sequencing pool, and then enriching the pooled library using custom DNA capture arrays. Our use of DNA sequence from one species to capture and enrich the sequencing libraries of another species (i.e. cross‐species DNA capture) indicates that efficient enrichment occurs when sequences are up to about 12% divergent, allowing us to take advantage of genomic information in one species to sequence orthologous regions in related species. In addition to a complete mitochondrial genome on each array, we have included between 43 and 118 nuclear loci for low‐coverage sequencing of between 18 kb and 87 kb of DNA sequence per individual for single nucleotide polymorphism discovery from 50 to 100 individuals in a single sequencing lane. Using this method, we have generated a total of over 500 whole mitochondrial genomes from seven cetacean species and green sea turtles. The greater variation detected in mitogenomes relative to short mtDNA sequences is helping to resolve genetic structure ranging from geographic to species‐level differences. These NGS and analysis techniques have allowed for simultaneous population genomic studies of mtDNA and nDNA with greater genomic coverage and phylogeographic resolution than has previously been possible in marine mammals and turtles.

4.
Next‐generation sequencing (NGS) methodologies have proven useful in deciphering the food items of generalist predators, but have yet to be applied to gelatinous animal gut and tentacle content. NGS can potentially supplement traditional methods of visual identification. Chrysaora quinquecirrha (Atlantic sea nettle) has progressively become more abundant in Mid‐Atlantic United States’ estuaries including Barnegat Bay (New Jersey), potentially having detrimental effects on both marine organisms and human enterprises. Full characterization of this predator's diet is essential for a comprehensive understanding of its impact on the food web and its management. Here, we tested the efficacy of NGS for prey item determination in the Atlantic sea nettle. We implemented an NGS ‘shotgun’ approach to randomly sequence DNA fragments isolated from gut lavages and gastric pouch/tentacle picks of eight and 84 sea nettles, respectively. These results were verified by visual identification and co‐occurring plankton tows. Over 550 000 contigs were assembled from ~110 million paired‐end reads. Of these, 100 contigs were confidently assigned to 23 different taxa, including soft‐bodied organisms previously undocumented as prey species, including copepods, fish, ctenophores, anemones, amphipods, barnacles, shrimp, polychaete worms, flukes, flatworms, echinoderms, gastropods, bivalves and hemichordates. Our results indicate not only that a ‘shotgun’ NGS approach can supplement visual identification methods, but also that targeted enrichment of a specific amplicon/gene is not a prerequisite for identifying Atlantic sea nettle prey items.

5.
As researchers begin probing deep coverage sequencing data for increasingly rare mutations and subclonal events, the fidelity of next generation sequencing (NGS) laboratory methods will become increasingly critical. Although error rates for sequencing and polymerase chain reaction (PCR) are well documented, the effects that DNA extraction and other library preparation steps could have on downstream sequence integrity have not been thoroughly evaluated. Here, we describe the discovery of novel C > A/G > T transversion artifacts found at low allelic fractions in targeted capture data. Characteristics such as sequencer read orientation and presence in both tumor and normal samples strongly indicated a non-biological mechanism. We identified the source as oxidation of DNA during acoustic shearing in samples containing reactive contaminants from the extraction process. We show generation of 8-oxoguanine (8-oxoG) lesions during DNA shearing, present analysis tools to detect oxidation in sequencing data and suggest methods to reduce DNA oxidation through the introduction of antioxidants. Further, informatics methods are presented to confidently filter these artifacts from sequencing data sets. Though only seen in a low percentage of reads in affected samples, such artifacts could have profoundly deleterious effects on the ability to confidently call rare mutations, and eliminating other possible sources of artifacts should become a priority for the research community.

6.
Next‐generation sequencing allows access to a large quantity of genomic data. In plants, several studies used whole chloroplast genome sequences for inferring phylogeography or phylogeny. Even though the chloroplast is a haploid organelle, NGS plastome data identified a nonnegligible number of intra‐individual polymorphic SNPs. Such observations could have several causes such as sequencing errors, the presence of heteroplasmy or transfer of chloroplast sequences into the nuclear and mitochondrial genomes. The occurrence of allelic diversity has important practical impacts on the identification of diversity, the analysis of the chloroplast data and beyond that, significant evolutionary questions. In this study, we show that the observed intra‐individual polymorphism of chloroplast sequence data is probably the result of plastid DNA transferred into the mitochondrial and/or the nuclear genomes. We further assess nine different bioinformatics pipelines’ error rates for SNP and genotype calling using SNPs identified in Sanger sequencing. Specific pipelines are adequate to deal with this issue, optimizing both specificity and sensitivity. Our results will allow a proper use of whole chloroplast NGS sequence and will allow a better handling of NGS chloroplast sequence diversity.

7.
Next‐generation sequencing technologies provide opportunities to understand the genetic basis of phenotypic differences, such as abiotic stress response, even in closely related cultivars via identification of a large number of DNA polymorphisms. We performed whole‐genome resequencing of three rice cultivars with contrasting responses to drought and salinity stress (sensitive IR64, drought‐tolerant Nagina 22 and salinity‐tolerant Pokkali). More than 356 million 90‐bp paired‐end reads were generated, which provided about 85% coverage of the rice genome. Applying stringent parameters, we identified a total of 1 784 583 nonredundant single‐nucleotide polymorphisms (SNPs) and 154 275 InDels between reference (Nipponbare) and the three resequenced cultivars. We detected 401 683 and 662 509 SNPs between IR64 and Pokkali, and IR64 and N22 cultivars, respectively. The distribution of DNA polymorphisms was found to be uneven across and within the rice chromosomes. One‐fourth of the SNPs and InDels were detected in genic regions, and about 3.5% of the total SNPs resulted in nonsynonymous changes. Large‐effect SNPs and InDels, which affect the integrity of the encoded protein, were also identified. Further, we identified DNA polymorphisms present in the differentially expressed genes within the known quantitative trait loci. Among these, a total of 548 SNPs in 232 genes, located in the conserved functional domains, were identified. The data presented in this study provide functional markers and promising target genes for salinity and drought tolerance and present a valuable resource for high‐throughput genotyping and molecular breeding for abiotic stress traits in rice.

8.
Next‐generation sequencing (NGS) provides a powerful tool for the discovery of important genes and alleles in crop plants and their wild relatives. Despite great advances in NGS technologies, whole‐genome shotgun sequencing is cost‐prohibitive for species with complex genomes. An attractive option is to reduce genome complexity to a single chromosome prior to sequencing. This work describes a strategy for studying the genomes of distant wild relatives of wheat by isolating single chromosomes from addition or substitution lines, followed by chromosome sorting using flow cytometry and sequencing of chromosomal DNA by NGS technology. We flow‐sorted chromosome 5Mg from a wheat/Aegilops geniculata disomic substitution line [DS5Mg (5D)] and sequenced it using an Illumina HiSeq 2000 system at approximately 50 × coverage. Paired‐end sequences were assembled and used for structural and functional annotation. A total of 4236 genes were annotated on 5Mg, in close agreement with the predicted number of genes on wheat chromosome 5D (4286). Single‐gene FISH indicated no major chromosomal rearrangements between chromosomes 5Mg and 5D. Comparing chromosome 5Mg with model grass genomes identified synteny blocks in Brachypodium distachyon, rice (Oryza sativa), sorghum (Sorghum bicolor) and barley (Hordeum vulgare). Chromosome 5Mg‐specific SNPs and cytogenetic probe‐based resources were developed and validated. Deletion bin‐mapped and ordered 5Mg SNP markers will be useful to track 5M‐specific introgressions and translocations. This study provides a detailed sequence‐based analysis of the composition of a chromosome from a distant wild relative of bread wheat, and opens up opportunities to develop genomic resources for wild germplasm to facilitate crop improvement.

9.
10.
Here, we present an adaptation of restriction‐site‐associated DNA sequencing (RAD‐seq) to the Illumina HiSeq2000 technology that we used to produce SNP markers in very large quantities at low cost per unit in the Réunion grey white‐eye (Zosterops borbonicus), a nonmodel passerine bird species with no reference genome. We sequenced a set of six pools of 18–25 individuals using a single sequencing lane. This allowed us to build around 600 000 contigs, among which at least 386 000 could be mapped to the zebra finch (Taeniopygia guttata) genome. This yielded more than 80 000 SNPs that could be mapped unambiguously and are evenly distributed across the genome. Thus, our approach provides a good illustration of the high potential of paired‐end RAD sequencing of pooled DNA samples combined with comparative assembly to the zebra finch genome to build large contigs and characterize vast numbers of informative SNPs in nonmodel passerine bird species in a very efficient and cost‐effective way.

11.
Heavy‐ion beams have been widely utilized as a novel and effective mutagen for mutation breeding in diverse plant species, but the induced mutation spectrum is not fully understood at the genome scale. We describe the development of a multiplexed and cost‐efficient whole‐exome sequencing procedure in rice, and its application to characterize an unselected population of heavy‐ion beam‐induced mutations. The bioinformatics pipeline identified single‐nucleotide mutations as well as small and large (>63 kb) insertions and deletions, and showed good agreement with the results obtained with conventional polymerase chain reaction (PCR) and sequencing analyses. We applied the procedure to analyze the mutation spectrum induced by heavy‐ion beams at the population level. In total, 165 individual M2 lines derived from six irradiation conditions as well as eight pools from non‐irradiated ‘Nipponbare’ controls were sequenced using the newly established target exome sequencing procedure. The characteristics and distribution of carbon‐ion beam‐induced mutations were analyzed in the absence of bias introduced by visual mutant selections. The average (±SE) number of mutations within the target exon regions was 9.06 ± 0.37 induced by 150 Gy irradiation of dry seeds. The mutation frequency changed in parallel to the irradiation dose when dry seeds were irradiated. The total number of mutations detected by sequencing unselected M2 lines was correlated with the conventional mutation frequency determined by the occurrence of morphological mutants. Therefore, mutation frequency may be a good indicator for sequencing‐based determination of the optimal irradiation condition for induction of mutations.

12.
The related A genome species of the Oryza genus are the effective gene pool for rice. Here, we report draft genomes for two Australian wild A genome taxa: O. rufipogon‐like population, referred to as Taxon A, and O. meridionalis‐like population, referred to as Taxon B. These two taxa were sequenced and assembled by integration of short‐ and long‐read next‐generation sequencing (NGS) data to create a genomic platform for a wider rice gene pool. Here, we report that, despite the distinct chloroplast genome, the nuclear genome of the Australian Taxon A has a sequence that is much closer to that of domesticated rice (O. sativa) than to the other Australian wild populations. Analysis of 4643 genes in the A genome clade showed that the Australian annual, O. meridionalis, and related perennial taxa have the most divergent (around 3 million years) genome sequences relative to domesticated rice. A test for admixture showed possible introgression into the Australian Taxon A (diverged around 1.6 million years ago) especially from the wild indica/O. nivara clade in Asia. These results demonstrate that northern Australia may be the centre of diversity of the A genome Oryza and suggest the possibility that this might also be the centre of origin of this group and represent an important resource for rice improvement.

13.
A common goal in the discovery of rare functional DNA variants via medical resequencing is to incur a relatively lower proportion of false positive base-calls. We developed a novel statistical method for resequencing arrays (SRMA, sequence robust multi-array analysis) to increase the accuracy of detecting rare variants and reduce the costs in subsequent sequence verifications required in medical applications. SRMA includes single and multi-array analysis and accounts for technical variables as well as the possibility of both low- and high-frequency genomic variation. The confidence of each base-call was ranked using two quality measures. In comparison to Sanger capillary sequencing, we achieved a false discovery rate of 2% (false positive rate 1.2 × 10⁻⁵, false negative rate 5%), which is similar to automated second-generation sequencing technologies. Applied to the analysis of 39 nuclear candidate genes in disorders of mitochondrial DNA (mtDNA) maintenance, we confirmed mutations in the DNA polymerase gamma POLG in positive control cases, and identified novel rare variants in previously undiagnosed cases in the mitochondrial topoisomerase TOP1MT, the mismatch repair enzyme MUTYH, and the apurinic-apyrimidinic endonuclease APEX2. Some patients carried rare heterozygous variants in several functionally interacting genes, which could indicate synergistic genetic effects in these clinically similar disorders.

14.
The development of microsatellite loci has become more efficient using next‐generation sequencing (NGS) approaches, and many studies imply that the number of applicable loci is large. However, few studies have sought to quantify the number of loci that are retained for use out of the thousands of sequence reads initially obtained. We analyzed the success rate of microsatellite loci development for three amphibian species using a 454 NGS approach on tetra‐nucleotide motif‐enriched species‐specific libraries. The number of sequence reads obtained differed strongly between species and ranged from 19,562 for Triturus cristatus to 55,626 for Lissotriton helveticus, with 52,075 reads obtained for Calotriton asper. PHOBOS was used to identify sequences with tetra‐nucleotide repeat motifs with a minimum repeat number of ten and high quality primer binding sites. Of 107 sequences for T. cristatus, 316 for C. asper and 319 for L. helveticus, we tested the amplification success, polymorphism, and degree of heterozygosity for 41 primer combinations each for C. asper and T. cristatus, and 22 for L. helveticus. We found 11 polymorphic loci for T. cristatus, 20 loci for C. asper, and 15 loci for L. helveticus. Extrapolated, the number of potentially amplifiable loci (PALs) resulted in estimated species‐specific success rates of 0.15% (T. cristatus), 0.30% (C. asper), and 0.39% (L. helveticus). Compared with representative Illumina NGS approaches, our applied 454‐sequencing approach on specifically enriched sublibraries proved to be quite competitive in terms of success rates and number of finally applicable loci.
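The extrapolated success rates quoted in this abstract can be reproduced with a short calculation. The formula below is an assumption about how the extrapolation was done, since the abstract does not spell it out: the fraction of tested primer pairs that gave polymorphic loci is scaled up to all candidate sequences, then divided by the total number of reads. The function name is hypothetical; all numbers are taken from the abstract.

```python
def extrapolated_success_rate(reads: int, candidates: int,
                              tested: int, polymorphic: int) -> float:
    """Estimated usable loci per sequence read (assumed formula, see lead-in)."""
    # Scale the observed success among tested primers to all candidates.
    expected_loci = candidates * polymorphic / tested
    return expected_loci / reads


# (reads, candidate sequences, primer pairs tested, polymorphic loci)
species = {
    "T. cristatus": (19562, 107, 41, 11),
    "C. asper": (52075, 316, 41, 20),
    "L. helveticus": (55626, 319, 22, 15),
}

for name, figures in species.items():
    print(f"{name}: {extrapolated_success_rate(*figures):.2%}")
```

Under this assumption the three rates come out at about 0.15%, 0.30% and 0.39%, which agrees with the values reported in the abstract.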

15.
We present the development of a genomic library using a RADseq (restriction site associated DNA sequencing) protocol for marker discovery that can be applied to evolutionary studies of the sugarcane borer Diatraea saccharalis, an important South American insect pest. A RADtag protocol combined with Illumina paired‐end sequencing allowed de novo discovery of 12 811 SNPs and a high‐quality assembly of 122.8M paired‐end reads from six individuals, representing 40 Gb of sequencing data. Approximately 1.7 Mb of the sugarcane borer genome distributed over 5289 minicontigs were obtained upon assembly of second reads from first reads RADtag loci where at least one SNP was discovered and genotyped. Minicontig lengths ranged from 200 to 611 bp and were used for functional annotation and microsatellite discovery. These markers will be used in future studies to understand gene flow and adaptation to host plants and control tactics.

16.
Preserving biodiversity is a global challenge requiring data on species’ distribution and abundance over large geographic and temporal scales. However, traditional methods to survey mobile species’ distribution and abundance in marine environments are often inefficient, environmentally destructive, or resource‐intensive. Metabarcoding of environmental DNA (eDNA) offers a new means to assess biodiversity and on much larger scales, but adoption of this approach for surveying whole animal communities in large, dynamic aquatic systems has been slowed by significant unknowns surrounding error rates of detection and relevant spatial resolution of eDNA surveys. Here, we report the results of a 2.5 km eDNA transect surveying the vertebrate fauna present along a gradation of diverse marine habitats associated with a kelp forest ecosystem. Using PCR primers that target the mitochondrial 12S rRNA gene of marine fishes and mammals, we generated eDNA sequence data and compared it to simultaneous visual dive surveys. We find spatial concordance between individual species’ eDNA and visual survey trends, and that eDNA is able to distinguish vertebrate community assemblages from habitats separated by as little as ~60 m. eDNA reliably detected vertebrates with low false‐negative error rates (1/12 taxa) when compared to the surveys, and revealed cryptic species known to occupy the habitats but overlooked by visual methods. This study also presents an explicit accounting of false negatives and positives in metabarcoding data, which illustrates the influence of gene marker selection, replication, contamination, biases impacting eDNA count data and ecology of target species on eDNA detection rates in an open ecosystem.

17.
Generating a contiguous, ordered reference sequence of a complex genome such as hexaploid wheat (2n = 6x = 42; approximately 17 Gb) is a challenging task due to its large, highly repetitive, and allopolyploid genome. In wheat, ordering of whole‐genome or hierarchical shotgun sequencing contigs is primarily based on recombination and comparative genomics‐based approaches. However, comparative genomics approaches are limited to syntenic inference and recombination is suppressed within the pericentromeric regions of wheat chromosomes; thus, precise ordering of physical maps and sequenced contigs across the whole genome using these approaches is nearly impossible. We developed a whole‐genome radiation hybrid (WGRH) resource and tested it by genotyping a set of 115 randomly selected lines on a high‐density single nucleotide polymorphism (SNP) array. At the whole‐genome level, 26 299 SNP markers were mapped on the RH panel and provided an average mapping resolution of approximately 248 Kb/cR1500 with a total map length of 6866 cR1500. The 7296 unique mapping bins provided a five‐ to eight‐fold higher resolution than genetic maps used in similar studies. Most strikingly, the RH map had uniform bin resolution across the entire chromosome(s), including pericentromeric regions. Our research provides a valuable and low‐cost resource for anchoring and ordering sequenced BAC and next generation sequencing (NGS) contigs. The WGRH developed for reference wheat line Chinese Spring (CS‐WGRH) will be useful for anchoring and ordering sequenced BAC and NGS based contigs for assembling a high‐quality reference sequence of hexaploid wheat. Additionally, this study provides an excellent model for developing similar resources for other polyploid species.

18.
Biologists frequently sort specimen‐rich samples to species. This process is daunting when based on morphology, and disadvantageous if performed using molecular methods that destroy vouchers (e.g., metabarcoding). An alternative is barcoding every specimen in a bulk sample and then presorting the specimens using DNA barcodes, thus mitigating downstream morphological work on presorted units. Such a “reverse workflow” is too expensive using Sanger sequencing, but we here demonstrate that it is feasible with a next‐generation sequencing (NGS) barcoding pipeline that allows for cost‐effective high‐throughput generation of short specimen‐specific barcodes (313 bp of COI; laboratory cost <$0.50 per specimen) through next‐generation sequencing of tagged amplicons. We applied our approach to a large sample of tropical ants, obtaining barcodes for 3,290 of 4,032 specimens (82%). NGS barcodes and their corresponding specimens were then sorted into molecular operational taxonomic units (mOTUs) based on objective clustering and Automated Barcode Gap Discovery (ABGD). High diversity of 88–90 mOTUs (4% clustering) was found and morphologically validated based on preserved vouchers. The mOTUs were overwhelmingly in agreement with morphospecies (match ratio 0.95 at 4% clustering). Because of lack of coverage in existing barcode databases, only 18 could be accurately identified to named species, but our study yielded new barcodes for 48 species, including 28 that are potentially new to science. With its low cost and technical simplicity, the NGS barcoding pipeline can be implemented by a large range of laboratories. It accelerates invertebrate species discovery, facilitates downstream taxonomic work, helps with building comprehensive barcode databases and yields precise abundance information.

19.
Rice bakanae is an important disease that causes serious rice production loss worldwide. We describe a new method for rapid diagnosis of rice bakanae caused by Fusarium fujikuroi and F. proliferatum, based on loop‐mediated isothermal amplification (LAMP) assays. After screening, primers were selected to target Fusarium DNA sequences, that is, the intergenic spacer (IGS) region of the nuclear ribosomal operon and reductase‐coding region (RED1) in F. fujikuroi and F. proliferatum, respectively. Both LAMP assays efficiently amplified target genes in 70 min at 62°C. A colour change from purple to sky blue (visible to the unaided eye) was observed in the presence of the DNA of the targeted pathogens only, by adding hydroxynaphthol blue to the reaction system prior to amplification. The minimum amount of genomic DNA needed in the assays was 67 and 346 pg/μl for F. fujikuroi and F. proliferatum, respectively. Using the two assays described here, we successfully and rapidly diagnosed suspected diseased rice plant and seed samples collected from Jiangsu Province.

20.
The development and screening of microsatellite markers have been accelerated by next‐generation sequencing (NGS) technology and in particular GS‐FLX pyro‐sequencing (454). More recent platforms such as the PGM semiconductor sequencer (Ion Torrent) offer potential benefits such as dramatic reductions in cost, but to date have not been well utilized. Here, we critically compare the advantages and disadvantages of microsatellite development using PGM semiconductor sequencing and GS‐FLX pyro‐sequencing for two gymnosperm (a conifer and a cycad) and one angiosperm species. We show that these NGS platforms differ in the quantity of returned sequence data, unique microsatellite data and primer design opportunities, mostly consistent with the differences in read length. The strength of the PGM lies in the large amount of data generated at a comparatively lower cost and time. The strength of GS‐FLX lies in the return of longer average length sequences and therefore greater flexibility in producing markers with variable product length, due to longer flanking regions, which is ideal for capillary multiplexing. These differences need to be considered when choosing an NGS method for microsatellite discovery. However, the ongoing improvement in read lengths of the NGS platforms will reduce the disadvantage of the current short read lengths, particularly for the PGM platform, allowing greater flexibility in primer design coupled with the power of a larger number of sequences.
