Similar literature
 20 similar articles found (search time: 62 ms)
1.
The advent of next‐generation sequencing (NGS) technologies has transformed the way microsatellites are isolated for ecological and evolutionary investigations. Recent attempts to employ NGS for microsatellite discovery have used the 454, Illumina, and Ion Torrent platforms, but other methods including single‐molecule real‐time DNA sequencing (Pacific Biosciences or PacBio) remain viable alternatives. We outline a workflow from sequence quality control to microsatellite marker validation in three plant species using PacBio circular consensus sequencing (CCS). We then evaluate the performance of PacBio CCS in comparison with other NGS platforms for microsatellite isolation, through simulations that focus on variations in read length, read quantity and sequencing error rate. Although quality control of CCS reads reduced microsatellite yield by around 50%, hundreds of microsatellite loci that are expected to have improved conversion efficiency to functional markers were retrieved for each species. The simulations quantitatively validate the advantages of long reads and emphasize the detrimental effects of sequencing errors on NGS‐enabled microsatellite development. In view of the continuing improvement in read length on NGS platforms, sequence quality and the corresponding strategies of quality control will become the primary factors to consider for effective microsatellite isolation. Among current options, PacBio CCS may be optimal for rapid, small‐scale microsatellite development due to its flexibility in scaling sequencing effort, while platforms such as Illumina MiSeq will provide cost‐efficient solutions for multispecies microsatellite projects.
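
As a rough illustration of the microsatellite-detection step in such a workflow, the following Python sketch scans a single read for perfect tandem repeats and keeps only those with enough flanking sequence for primer design. It is not the authors' pipeline: the function name, the motif length range (2 to 6 bp), the minimum of five repeats and the 20-bp flank requirement are all assumptions chosen for the example.

```python
import re

# Assumed thresholds for illustration: motifs of 2-6 bp repeated at least 5 times.
# The lazy quantifier makes the regex prefer the shortest motif (e.g. "AC" over "ACAC").
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_microsatellites(read, min_flank=20):
    """Return (start, end, motif, n_repeats) for each perfect repeat in `read`
    that leaves at least `min_flank` bases on both sides for primer design."""
    read = read.upper()
    hits = []
    for match in SSR_PATTERN.finditer(read):
        motif = match.group(1)
        # Skip homopolymer runs, which the pattern would report as e.g. "AA".
        if len(set(motif)) == 1:
            continue
        start, end = match.span()
        if start >= min_flank and len(read) - end >= min_flank:
            hits.append((start, end, motif, (end - start) // len(motif)))
    return hits
```

Longer reads leave more room on either side of a repeat, so fewer loci are rejected by the flank check, which is one intuitive reading of the long-read advantage described in the abstract.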

2.
3.
The application of next-generation sequencing (NGS) technologies for the development of simple sequence repeat (SSR) or microsatellite loci for genetic research in the botanical sciences is described. Microsatellite markers are one of the most informative and versatile DNA-based markers used in plant genetic research, but their development has traditionally been a difficult and costly process. NGS technologies allow the efficient identification of large numbers of microsatellites at a fraction of the cost and effort of traditional approaches. The major advantage of NGS methods is their ability to produce large amounts of sequence data from which to isolate and develop numerous genome-wide and gene-based microsatellite loci. The two major NGS technologies with emergent application in SSR isolation are 454 and Illumina. A review is provided of several recent studies demonstrating the efficient use of 454 and Illumina technologies for the discovery of microsatellites in plants. Additionally, important aspects during NGS isolation and development of microsatellites are discussed, including the use of computational tools and high-throughput genotyping methods. A data set of microsatellite loci in the plastome and mitochondriome of cranberry (Vaccinium macrocarpon Ait.) is provided to illustrate a successful application of 454 sequencing for SSR discovery. In the future, NGS technologies will massively increase the number of SSRs and other genetic markers available to conduct genetic research in understudied but economically important crops such as cranberry.

4.
Molecular markers produced by next‐generation sequencing (NGS) technologies are revolutionizing genetic research. However, the costs of analysing large numbers of individual genomes remain prohibitive for most population genetics studies. Here, we present results based on mathematical derivations showing that, under many realistic experimental designs, NGS of DNA pools from diploid individuals allows allele frequencies at single nucleotide polymorphisms (SNPs) to be estimated with at least the same accuracy as individual‐based analyses, for considerably lower library construction and sequencing efforts. These findings remain true when taking into account the possibility of substantially unequal contributions of each individual to the final pool of sequence reads. We propose the intuitive notion of effective pool size to account for unequal pooling and derive a Bayesian hierarchical model to estimate this parameter directly from the data. We provide a user‐friendly application assessing the accuracy of allele frequency estimation from both pool‐ and individual‐based NGS population data under various sampling, sequencing depth and experimental error designs. We illustrate our findings with theoretical examples and real data sets corresponding to SNP loci obtained using restriction site–associated DNA (RAD) sequencing in pool‐ and individual‐based experiments carried out on the same population of the pine processionary moth (Thaumetopoea pityocampa). NGS of DNA pools might not be optimal for all types of studies but provides a cost‐effective approach for estimating allele frequencies for very large numbers of SNPs. It thus allows comparison of genome‐wide patterns of genetic variation for large numbers of individuals in multiple populations.
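
To make the pooled-versus-individual trade-off concrete, the sketch below computes the sampling variance of an allele-frequency estimate under a simple two-stage binomial model: 2n chromosomes are drawn from the population, then reads are drawn with replacement from the pool. This is a textbook simplification and not the derivation from the paper; it assumes equal contributions from every individual, no sequencing error and perfect genotype calls in the individual-based case, and the function names are hypothetical.

```python
def pool_freq_variance(p, n_individuals, depth):
    """Variance of a pooled allele-frequency estimate under two-stage
    binomial sampling (assumed model: equal contributions, no errors)."""
    chromosomes = 2 * n_individuals
    # First term: sampling of chromosomes into the pool.
    # Second term: sampling of reads from the pool.
    return p * (1 - p) * (1 / chromosomes + (1 - 1 / chromosomes) / depth)


def individual_freq_variance(p, n_individuals):
    """Variance when every diploid individual is genotyped without error."""
    return p * (1 - p) / (2 * n_individuals)
```

Under these assumptions the read-sampling term shrinks as depth grows, so a pool of many individuals sequenced deeply can approach or beat the variance of a smaller individually genotyped sample. Unequal contributions, which the paper handles through an effective pool size, are deliberately left out of this sketch.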

5.
6.
The development and screening of microsatellite markers have been accelerated by next‐generation sequencing (NGS) technology and in particular GS‐FLX pyro‐sequencing (454). More recent platforms such as the PGM semiconductor sequencer (Ion Torrent) offer potential benefits such as dramatic reductions in cost, but to date have not been well utilized. Here, we critically compare the advantages and disadvantages of microsatellite development using PGM semiconductor sequencing and GS‐FLX pyro‐sequencing for two gymnosperm (a conifer and a cycad) and one angiosperm species. We show that these NGS platforms differ in the quantity of returned sequence data, unique microsatellite data and primer design opportunities, mostly consistent with the differences in read length. The strength of the PGM lies in the large amount of data generated at a comparatively lower cost and time. The strength of GS‐FLX lies in the return of longer average length sequences and therefore greater flexibility in producing markers with variable product length, due to longer flanking regions, which is ideal for capillary multiplexing. These differences need to be considered when choosing an NGS method for microsatellite discovery. However, the ongoing improvement in read lengths of the NGS platforms will reduce the disadvantage of the current short read lengths, particularly for the PGM platform, allowing greater flexibility in primer design coupled with the power of a larger number of sequences.

7.
8.
Genes of the vertebrate major histocompatibility complex (MHC) are of great interest to biologists because of their important role in immunity and disease, and their extremely high levels of genetic diversity. Next generation sequencing (NGS) technologies are quickly becoming the method of choice for high-throughput genotyping of multi-locus templates like MHC in non-model organisms. Previous approaches to genotyping MHC genes using NGS technologies suffer from two problems: 1) a “gray zone” where low frequency alleles and high frequency artifacts can be difficult to disentangle and 2) a similar sequence problem, where very similar alleles can be difficult to distinguish as two distinct alleles. Here we present a new method for genotyping MHC loci – Stepwise Threshold Clustering (STC) – that addresses these problems by taking full advantage of the increase in sequence data provided by NGS technologies. Unlike previous approaches for genotyping MHC with NGS data that attempt to classify individual sequences as alleles or artifacts, STC uses a quasi-Dirichlet clustering algorithm to cluster similar sequences at increasing levels of sequence similarity. By applying frequency and similarity based criteria to clusters rather than individual sequences, STC is able to successfully identify clusters of sequences that correspond to individual or similar alleles present in the genomes of individual samples. Furthermore, STC does not require duplicate runs of all samples, increasing the number of samples that can be genotyped in a given project. We show how the STC method works using a single sample library. We then apply STC to 295 threespine stickleback (Gasterosteus aculeatus) samples from four populations and show that neighboring populations differ significantly in MHC allele pools. We show that STC is a reliable, accurate, efficient, and flexible method for genotyping MHC that will be of use to biologists interested in a variety of downstream applications.

9.
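The emergence of next-generation sequencing (NGS) technologies has advanced genomics research and has played a particularly important role in the study of disease-associated genetic variation. Although most types of genetic variation can be detected with existing NGS analysis tools, limitations remain, for example in detecting length variation of short tandem repeats (STRs). Many genetic diseases, especially neurological disorders such as Huntington's disease, are caused by STR length expansions. However, almost no current tool can use NGS data to detect STR variants that are longer than the sequencing read length. To overcome this limitation, we developed a new method that identifies STR length variation from paired-end NGS data and estimates the expansion length. Applied to a clinical study of motor neuron disease based on whole-exome sequencing, the method successfully identified a pathogenic STR expansion. The method is the first to use read coverage depth features to address STR variant detection; it has broad application value in human genetic disease research and may inform the development of other NGS analysis methods.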

10.
Type specimens have high scientific importance because they provide the only certain connection between the application of a Linnean name and a physical specimen. Many other individuals may have been identified as a particular species, but their linkage to the taxon concept is inferential. Because type specimens are often more than a century old and have experienced conditions unfavourable for DNA preservation, success in sequence recovery has been uncertain. This study addresses this challenge by employing next‐generation sequencing (NGS) to recover sequences for the barcode region of the cytochrome c oxidase 1 gene from small amounts of template DNA. DNA quality was first screened in more than 1800 century‐old type specimens of Lepidoptera by attempting to recover 164‐bp and 94‐bp reads via Sanger sequencing. This analysis permitted the assignment of each specimen to one of three DNA quality categories – high (164‐bp sequence), medium (94‐bp sequence) or low (no sequence). Ten specimens from each category were subsequently analysed via a PCR‐based NGS protocol requiring very little template DNA. It recovered sequence information from all specimens with average read lengths ranging from 458 bp to 610 bp for the three DNA categories. By sequencing ten specimens in each NGS run, costs were similar to Sanger analysis. Future increases in the number of specimens processed in each run promise substantial reductions in cost, making it possible to anticipate a future where barcode sequences are available from most type specimens.

11.
Small‐scale sequencing has improved substantially in recent decades, culminating in the development of next‐generation sequencing (NGS) technologies. Modern NGS methods have helped the discovery of many new plant viruses. Nevertheless, there is still a need to establish solid assembly pipelines targeting small genomes characterised by low identities to known viral sequences. Here, we describe and discuss the fundamental steps required for discovering and sequencing new plant viral genomes by NGS. A practical pipeline and standard alternative tools used in NGS analysis are presented.

12.
Next-generation sequencing technologies (NGS) have revolutionized biological research by significantly increasing data generation while simultaneously decreasing the time to data output. For many ecologists and evolutionary biologists, the research opportunities afforded by NGS are substantial; even for taxa lacking genomic resources, large-scale genome-level questions can now be addressed, opening up many new avenues of research. While rapid and massive sequencing afforded by NGS increases the scope and scale of many research objectives, whole genome sequencing is often unwarranted and unnecessarily complex for specific research questions. Recently developed targeted sequence enrichment, coupled with NGS, represents a beneficial strategy for enhancing data generation to answer questions in ecology and evolutionary biology. This marriage of technologies offers researchers a simple method to isolate and analyze a few to hundreds, or even thousands, of genes or genomic regions from few to many samples in a relatively efficient and effective manner. These strategies can be applied to questions at both the infra- and interspecific levels, including those involving parentage, gene flow, divergence, phylogenetics, reticulate evolution, and many more. Here we provide a brief overview of targeted sequence enrichment, and emphasize the power of this technology to increase our ability to address a wide range of questions of interest to ecologists and evolutionary biologists, particularly for those working with taxa for which few genomic resources are available.

13.
This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on genomics, with particular reference to currently available and possible future platforms and bioinformatics. NGS technologies have demonstrated the capacity to sequence DNA at unprecedented speed, thereby enabling previously unimaginable scientific achievements and novel biological applications. But the massive data produced by NGS also presents a significant challenge for data storage, analyses, and management solutions. Advanced bioinformatic tools are essential for the successful application of NGS technology. As evidenced throughout this review, NGS technologies will have a striking impact on genomic research and the entire biological field. With its ability to tackle the unsolved challenges unconquered by previous genomic technologies, NGS is likely to unravel the complexity of the human genome in terms of genetic variations, some of which may be confined to susceptible loci for some common human conditions. The impact of NGS technologies on genomics will be far reaching and likely change the field for years to come.

14.
Next-generation sequencing (NGS) technologies allow the cost-effective sequencing of whole genomes and have expanded the scope of genomics to novel applications, such as the genome-wide characterizatio...

15.
Early analytical clone screening is important during Chinese hamster ovary (CHO) cell line development of biotherapeutic proteins to select a clonally derived cell line with the most favorable stability and product quality. Sensitive sequence confirmation methods using mass spectrometry have limitations in throughput and turnaround time. Next‐generation sequencing (NGS) technologies emerged as alternatives for CHO clone analytics. We report an efficient NGS workflow applying the targeted locus amplification (TLA) strategy for genomic screening of antibody expressing CHO clones. In contrast to previously reported RNA sequencing approaches, TLA allows for targeted sequencing of genomic integrated transgenic DNA without prior locus information, robust detection of single‐nucleotide variants (SNVs) and transgenic rearrangements. During clone selection, TLA/NGS revealed CHO clones with high‐level SNVs within the antibody gene and we report in another case the utility of TLA/NGS to identify rearrangements at transgenic DNA level. We also determined detection limits for SNV calling and the potential to identify clone contaminations by TLA/NGS. TLA/NGS also allows the identification of genetically identical clones. In summary, we demonstrate that TLA/NGS is a robust screening method useful for routine clone analytics during cell line development with the potential to process up to 24 CHO clones in less than 7 workdays.

16.
Genes of the major histocompatibility complex (MHC) are considered a paradigm of adaptive evolution at the molecular level and as such are frequently investigated by evolutionary biologists and ecologists. Accurate genotyping is essential for understanding of the role that MHC variation plays in natural populations, but may be extremely challenging. Here, I discuss the DNA-based methods currently used for genotyping MHC in non-model vertebrates, as well as techniques likely to find widespread use in the future. I also highlight the aspects of MHC structure that are relevant for genotyping, and detail the challenges posed by the complex genomic organization and high sequence variation of MHC loci. Special emphasis is placed on designing appropriate PCR primers, accounting for artefacts and the problem of genotyping alleles from multiple, co-amplifying loci, a strategy which is frequently necessary due to the structure of the MHC. The suitability of typing techniques is compared in various research situations, strategies for efficient genotyping are discussed and areas of likely progress in future are identified. This review addresses the well established typing methods such as the Single Strand Conformation Polymorphism (SSCP), Denaturing Gradient Gel Electrophoresis (DGGE), Reference Strand Conformational Analysis (RSCA) and cloning of PCR products. In addition, it includes the intriguing possibility of direct amplicon sequencing followed by the computational inference of alleles and also next generation sequencing (NGS) technologies; the latter technique may, in the future, find widespread use in typing complex multilocus MHC systems.

17.
Background: Next-generation sequencing (NGS) technologies have fostered an unprecedented proliferation of high-throughput sequencing projects and a concomitant development of novel algorithms for the assembly of short reads. However, numerous technical or computational challenges in de novo assembly still remain, although many new ideas and solutions have been suggested to tackle the challenges in both experimental and computational settings. Results: In this review, we first briefly introduce some of the major challenges faced by NGS sequence assembly. Then, we analyze the characteristics of various sequencing platforms and their impact on assembly results. After that, we classify de novo assemblers according to their frameworks (overlap graph-based, de Bruijn graph-based and string graph-based), and introduce the characteristics of each assembly tool and the scenarios to which it is suited. Next, we introduce in detail the solutions to the main challenges of de novo assembly of next generation sequencing data, single-cell sequencing data and single molecule sequencing data. Finally, we discuss the application of single molecule sequencing (SMS) long reads in solving problems encountered in NGS assembly. Conclusions: This review not only gives an overview of the latest methods and developments in assembly algorithms, but also provides guidelines to determine the optimal assembly algorithm for a given input sequencing data type.
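
For readers unfamiliar with the de Bruijn graph framework mentioned above, the following Python sketch shows the basic construction: every k-mer in a read becomes an edge from its (k-1)-base prefix to its (k-1)-base suffix. It is a minimal illustration only; the function name is hypothetical, and real assemblers add error correction, reverse-complement handling and graph simplification, none of which is attempted here.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph as a dict mapping each (k-1)-mer to the list
    of (k-1)-mers that follow it in at least one read (one entry per k-mer)."""
    graph = defaultdict(list)
    for read in reads:
        # Slide a window of width k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # The k-mer is an edge from its prefix node to its suffix node.
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)
```

Assembly then amounts to finding paths through this graph, which is why repeats longer than k and sequencing errors (each of which creates spurious k-mers) are among the main difficulties the review discusses.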

18.
19.
Metabarcoding data generated using next-generation sequencing (NGS) technologies are overwhelmed with rare taxa and skewed in Operational Taxonomic Unit (OTU) frequencies comprised of few dominant taxa. Low frequency OTUs comprise a rare biosphere of singleton and doubleton OTUs, which may include many artifacts. We present an in-depth analysis of global singletons across sixteen NGS libraries representing different ribosomal RNA gene regions, NGS technologies and chemistries. Our data indicate that many singletons (average of 38 % across gene regions) are likely artifacts or potential artifacts, but a large fraction can be assigned to lower taxonomic levels with very high bootstrap support (∼32 % of sequences to genus with ≥90 % bootstrap cutoff). Further, many singletons clustered into rare OTUs from other datasets highlighting their overlap across datasets or the poor performance of clustering algorithms. These data emphasize a need for caution when discarding rare sequence data en masse: such practices may result in throwing the baby out with the bathwater, and underestimating the biodiversity. Yet, the rare sequences are unlikely to greatly affect ecological metrics. As a result, it may be prudent to err on the side of caution and omit rare OTUs prior to downstream analyses.
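
The skew described above, with many rare OTUs but few reads in them, can be quantified with a few lines of Python. The sketch below is an assumption-laden illustration rather than the authors' analysis: it takes a hypothetical dictionary of per-OTU read counts for one library and reports how many OTUs are singletons or doubletons and what share of the reads they hold.

```python
from collections import Counter


def rare_otu_summary(otu_counts):
    """Summarise singleton and doubleton OTUs in one library.

    `otu_counts` is a hypothetical mapping of OTU id -> total read count.
    """
    size_histogram = Counter(otu_counts.values())
    singletons = size_histogram[1]
    doubletons = size_histogram[2]
    total_reads = sum(otu_counts.values())
    rare_reads = singletons * 1 + doubletons * 2
    return {
        "singletons": singletons,
        "doubletons": doubletons,
        "rare_otu_fraction": (singletons + doubletons) / len(otu_counts),
        "rare_read_fraction": rare_reads / total_reads,
    }
```

A library where rare OTUs make up most of the OTU list but a tiny share of the reads is consistent with the abstract's point that discarding them changes richness estimates far more than abundance-weighted metrics.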

20.
Next‐generation sequencing (NGS) is increasingly used for diet analyses; however, it may not always describe diet samples well. A reason for this is that diet samples contain mixtures of food DNA in different amounts as well as consumer DNA, which can reduce the amount of food DNA characterized. Because of this, detections will depend on the relative amount and identity of each type of DNA. For such samples, diagnostic PCR will most likely give more reliable results, as detection probability is only marginally dependent on other copresent DNA. We investigated the reliability of each method to (a) test whether predatory beetle regurgitates, supposed to be low in consumer DNA, allow prey sequences to be retrieved using general barcoding primers that co‐amplify the consumer DNA, and (b) assess the sequencing depth or replication needed for NGS and diagnostic PCR to give stable results. When consumer DNA is co‐amplified, NGS is better suited to discovering the range of possible prey than to comparing co‐occurrences of diet species between samples, as retested samples were repeatedly different in prey detections with this approach. This shows that samples were incompletely described, as prey detected by diagnostic PCR frequently were missed by NGS. As the sequencing depth needed to reliably describe the diet in such samples becomes very high, the cost‐efficiency and reliability of diagnostic PCR make it better suited for testing large sample‐sets, especially if the targeted prey taxa are thought to be of ecological importance, as diagnostic PCR gave more nested and consistent results in repeated testing of the same sample.
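
The dependence of detection on the relative amount of prey DNA can be illustrated with a simple sampling model. The Python sketch below assumes that each read is an independent draw and that a fixed fraction of the amplified DNA belongs to the prey of interest; both are simplifications not taken from the study, and the function names are hypothetical.

```python
import math


def detection_probability(prey_fraction, n_reads):
    """Probability of seeing at least one prey read among `n_reads`,
    assuming independent reads and a fixed prey fraction (assumed model)."""
    return 1.0 - (1.0 - prey_fraction) ** n_reads


def reads_needed(prey_fraction, target=0.95):
    """Smallest read count whose detection probability reaches `target`
    under the same assumed model."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - prey_fraction))
```

Under this model the required depth grows roughly in inverse proportion to the prey fraction, so heavy co-amplification of consumer DNA quickly pushes the depth needed per sample upward, in line with the abstract's conclusion about cost-efficiency.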
