Similar literature
 20 similar articles found (search time: 0 ms)
1.
Next‐generation sequencing (NGS) technologies are revolutionizing the fields of biology and medicine as powerful tools for amplicon sequencing (AS). Using combinations of primers and barcodes, it is possible to sequence targeted genomic regions with deep coverage for hundreds, even thousands, of individuals in a single experiment. This is extremely valuable for the genotyping of gene families in which locus‐specific primers are often difficult to design, such as the major histocompatibility complex (MHC). The utility of AS is, however, limited by the high intrinsic sequencing error rates of NGS technologies and other sources of error such as polymerase amplification or chimera formation. Correcting these errors requires extensive bioinformatic post‐processing of NGS data. Amplicon Sequence Assignment (amplisas) is a tool that performs analysis of AS results in a simple and efficient way, while offering customization options for advanced users. amplisas is designed as a three‐step pipeline consisting of (i) read demultiplexing, (ii) unique sequence clustering and (iii) erroneous sequence filtering. Allele sequences and frequencies are retrieved in Excel spreadsheet format, making them easy to interpret. amplisas performance has been successfully benchmarked against previously published genotyped MHC data sets obtained with various NGS technologies.
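
The three steps named in this abstract can be illustrated with a minimal Python sketch. It is not the amplisas implementation: the function names, the Hamming-distance merge rule and the `max_mismatches` and `min_freq` thresholds are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def demultiplex(reads, barcodes):
    """Step (i): assign each read to a sample by its leading barcode, then trim the barcode."""
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
    return by_sample


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def cluster_unique(seqs, max_mismatches=1):
    """Step (ii): collapse identical reads, then merge each rarer variant into the
    first more abundant sequence of the same length within max_mismatches (assumed rule)."""
    clusters = {}
    for seq, n in Counter(seqs).most_common():
        for parent in clusters:
            if len(parent) == len(seq) and hamming(parent, seq) <= max_mismatches:
                clusters[parent] += n
                break
        else:
            clusters[seq] = n
    return clusters


def filter_alleles(clusters, min_freq=0.1):
    """Step (iii): keep clusters whose within-sample frequency reaches min_freq (assumed threshold)."""
    total = sum(clusters.values())
    return {seq: n / total for seq, n in clusters.items() if n / total >= min_freq}
```

Running the three functions in sequence on one sample's reads yields a sequence-to-frequency mapping comparable in spirit to the allele table the abstract describes.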

2.
To enable rapid selection of traits in marker‐assisted breeding, markers must be technically simple, low‐cost, high‐throughput and randomly distributed in a genome. We developed such a technology, designated as Multiplex Restriction Amplicon Sequencing (MRASeq), which reduces genome complexity by polymerase chain reaction (PCR) amplification of amplicons flanked by restriction sites. The first PCR primers contain restriction site sequences at 3′‐ends, preceded by 6‐10 bases of specific or degenerate nucleotide sequences and then by a unique M13‐tail sequence which serves as a binding site for a second PCR that adds sequencing primers and barcodes to allow sample multiplexing for sequencing. The sequences of restriction sites and adjacent nucleotides can be altered to suit different species. Physical mapping of MRASeq SNPs from a biparental population of allohexaploid wheat (Triticum aestivum L.) showed a random distribution of SNPs across the genome. MRASeq generated thousands of SNPs from a wheat biparental population and natural populations of wheat and barley (Hordeum vulgare L.). This novel, next‐generation sequencing‐based genotyping platform can be used for linkage mapping to screen quantitative trait loci (QTL), background selection in breeding and many other genetics and breeding applications of various species.

3.
Next‐generation sequencing (NGS) is emerging as an efficient and cost‐effective tool in population genomic analyses of nonmodel organisms, allowing simultaneous resequencing of many regions of multi‐genomic DNA from multiplexed samples. Here, we detail our synthesis of protocols for targeted resequencing of mitochondrial and nuclear loci by generating indexed genomic libraries for multiplexing up to 100 individuals in a single sequencing pool, and then enriching the pooled library using custom DNA capture arrays. Our use of DNA sequence from one species to capture and enrich the sequencing libraries of another species (i.e. cross‐species DNA capture) indicates that efficient enrichment occurs when sequences are up to about 12% divergent, allowing us to take advantage of genomic information in one species to sequence orthologous regions in related species. In addition to a complete mitochondrial genome on each array, we have included between 43 and 118 nuclear loci for low‐coverage sequencing of between 18 kb and 87 kb of DNA sequence per individual for single nucleotide polymorphism discovery from 50 to 100 individuals in a single sequencing lane. Using this method, we have generated a total of over 500 whole mitochondrial genomes from seven cetacean species and green sea turtles. The greater variation detected in mitogenomes relative to short mtDNA sequences is helping to resolve genetic structure ranging from geographic to species‐level differences. These NGS and analysis techniques have allowed for simultaneous population genomic studies of mtDNA and nDNA with greater genomic coverage and phylogeographic resolution than has previously been possible in marine mammals and turtles.

4.
Type specimens have high scientific importance because they provide the only certain connection between the application of a Linnean name and a physical specimen. Many other individuals may have been identified as a particular species, but their linkage to the taxon concept is inferential. Because type specimens are often more than a century old and have experienced conditions unfavourable for DNA preservation, success in sequence recovery has been uncertain. This study addresses this challenge by employing next‐generation sequencing (NGS) to recover sequences for the barcode region of the cytochrome c oxidase 1 gene from small amounts of template DNA. DNA quality was first screened in more than 1800 century‐old type specimens of Lepidoptera by attempting to recover 164‐bp and 94‐bp reads via Sanger sequencing. This analysis permitted the assignment of each specimen to one of three DNA quality categories – high (164‐bp sequence), medium (94‐bp sequence) or low (no sequence). Ten specimens from each category were subsequently analysed via a PCR‐based NGS protocol requiring very little template DNA. It recovered sequence information from all specimens with average read lengths ranging from 458 bp to 610 bp for the three DNA categories. By sequencing ten specimens in each NGS run, costs were similar to Sanger analysis. Future increases in the number of specimens processed in each run promise substantial reductions in cost, making it possible to anticipate a future where barcode sequences are available from most type specimens.

5.
Due to its cost effectiveness, next generation sequencing of pools of individuals (Pool‐Seq) is becoming a popular strategy for genome‐wide estimation of allele frequencies in population samples. As the allele frequency spectrum provides information about past episodes of selection, Pool‐Seq is also a promising design for genomic scans for selection. However, no software tool has yet been developed for selection scans based on Pool‐Seq data. We introduce Pool‐hmm, a Python program for the estimation of allele frequencies and the detection of selective sweeps in a Pool‐Seq sample. Pool‐hmm includes several options that allow a flexible analysis of Pool‐Seq data, and can be run in parallel on several processors. Source code and documentation for Pool‐hmm are freely available at https://qgsp.jouy.inra.fr/.

6.
Cichlid fishes (family Cichlidae) are models for evolutionary and ecological research. Massively parallel sequencing approaches have been successfully applied to study relatively recent diversification in groups of African and Neotropical cichlids, but such technologies have yet to be used for addressing larger‐scale phylogenetic questions of cichlid evolution. Here, we describe a process for identifying putative single‐copy exons from five African cichlid genomes and sequence the targeted exons for a range of divergent (>tens of millions of years) taxa with probes designed from a single reference species (Oreochromis niloticus, Nile tilapia). Targeted sequencing of 923 exons across 10 cichlid species that represent the family's major lineages and geographic distribution resulted in a complete taxon matrix of 564 exons (649 549 bp), representing 559 genes. Maximum likelihood and Bayesian analyses in both species tree and concatenation frameworks yielded the same fully resolved and highly supported topology, which matched the expected backbone phylogeny of the major cichlid lineages. This work adds to the body of evidence that it is possible to use a relatively divergent reference genome for exon target design and successful capture across a broad phylogenetic range of species. Furthermore, our results show that the use of a third‐party laboratory coupled with accessible bioinformatics tools makes such phylogenomics projects feasible for research groups that lack direct access to genomic facilities. We expect that these resources will be used in further cichlid evolution studies and hope the protocols and identified targets will also be useful for phylogenetic studies of a wider range of organisms.

7.
Pollen monitoring is an important and widely used tool in allergy research and creation of awareness in pollen‐allergic patients. Current pollen monitoring methods are microscope‐based, labour intensive and cannot identify pollen to the genus level in some relevant allergenic plant groups. Therefore, a more efficient, cost‐effective and sensitive method is needed. Here, we present a method for identification and quantification of airborne pollen using DNA sequencing. Pollen is collected from ambient air using standard techniques. DNA is extracted from the collected pollen, and a fragment of the chloroplast gene trnL is amplified using PCR. The PCR product is subsequently sequenced on a next‐generation sequencing platform (Ion Torrent). Amplicon molecules are sequenced individually, allowing identification of different sequences from a mixed sample. We show that this method provides an accurate qualitative and quantitative view of the species composition of samples of airborne pollen grains. We also show that it correctly identifies the individual grass genera present in a mixed sample of grass pollen, which cannot be achieved using microscopic pollen identification. We conclude that our method is more efficient and sensitive than current pollen monitoring techniques and therefore has the potential to increase the throughput of pollen monitoring.

8.
The European rabbit (Oryctolagus cuniculus) is a domesticated species with one of the broadest ranges of economic and scientific applications and fields of investigation. Rabbit genome information and assembly are available (oryCun2.0), but so far few studies have investigated its variability, and no large‐scale discovery of polymorphisms has yet been published for this species. Here, we sequenced two reduced representation libraries (RRLs) to identify single nucleotide polymorphisms (SNPs) in the rabbit genome. Genomic DNA of 10 rabbits belonging to different breeds was pooled and digested with two restriction enzymes (HaeIII and RsaI) to create two RRLs which were sequenced using the Ion Torrent Personal Genome Machine. The two RRLs produced 2 917 879 and 4 046 871 reads, for a total of 280.51 Mb (248.49 Mb with quality >20) and 417.28 Mb (360.89 Mb with quality >20) respectively of sequenced DNA. About 90% and 91% respectively of the obtained reads were mapped on the rabbit genome, covering a total of 15.82% of the oryCun2.0 genome version. The mapping and ad hoc filtering procedures allowed 62 491 SNPs to be called reliably. SNPs in a few genomic regions were validated by Sanger sequencing. The Variant Effect Predictor Web tool was used to map SNPs on the current version of the rabbit genome. The obtained results will be useful for many applied and basic research programs for this species and will contribute to the development of cost‐effective solutions for high‐throughput SNP genotyping in the rabbit.
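
The restriction digestion behind a reduced representation library can be mimicked in silico. The Python sketch below is an illustration only, not the authors' procedure: it assumes the standard blunt cut positions of HaeIII (GG^CC) and RsaI (GT^AC), and the `digest` function and `SITES` table are hypothetical names.

```python
import re

# Recognition site and cut offset within the site (both enzymes cut blunt, mid-site).
SITES = {"HaeIII": ("GGCC", 2), "RsaI": ("GTAC", 2)}


def digest(seq, enzyme):
    """Return the fragments produced by cutting seq at every site of the given enzyme."""
    site, cut_offset = SITES[enzyme]
    # A lookahead finds every site start, including overlapping matches.
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

For example, `digest("AAGGCCTTGGCCA", "HaeIII")` splits the sequence after each `GG` of a `GGCC` site.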

9.
10.
Small‐scale sequencing has improved substantially in recent decades, culminating in the development of next‐generation sequencing (NGS) technologies. Modern NGS methods have helped the discovery of many new plant viruses. Nevertheless, there is still a need to establish solid assembly pipelines targeting small genomes characterised by low identities to known viral sequences. Here, we describe and discuss the fundamental steps required for discovering and sequencing new plant viral genomes by NGS. A practical pipeline and standard alternative tools used in NGS analysis are presented.

11.
The computer program exonsampler automates the sampling of thousands of exon sequences from publicly available reference genome sequences and gene annotation databases. It was designed to provide exon sequences for the efficient, next‐generation gene sequencing method called exon capture. The exon sequences can be sampled by a list of gene name abbreviations (e.g. IFNG, TLR1), or by sampling exons from genes spaced evenly across chromosomes. It provides a list of genomic coordinates (a bed file), as well as a set of sequences in fasta format. User‐adjustable parameters for collecting exon sequences include a minimum and maximum acceptable exon length, maximum number of exonic base pairs (bp) to sample per gene, and maximum total bp for the entire collection. It allows for partial sampling of very large exons. It can preferentially sample upstream (5 prime) exons, downstream (3 prime) exons, both external exons, or all internal exons. It is written in the Python programming language using its free libraries. We describe the use of exonsampler to collect exon sequences from the domestic cow (Bos taurus) genome for the design of an exon‐capture microarray to sequence exons from related species, including the zebu cow and wild bison. We collected ~10% of the exome (~3 million bp), including 155 candidate genes, and ~16 000 exons evenly spaced genomewide. We prioritized the collection of 5 prime exons to facilitate discovery and genotyping of SNPs near upstream gene regulatory DNA sequences, which control gene expression and are often under natural selection.
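
The user‐adjustable limits listed in this abstract (exon length bounds, a per‐gene bp cap, a total bp cap, partial sampling of large exons and a preference for 5 prime exons) can be sketched in Python as follows. This is not exonsampler's code: the `sample_exons` function, its parameter names, the input tuple layout and the plus‐strand truncation rule are all assumptions.

```python
def sample_exons(exons, min_len=100, max_len=1000,
                 max_bp_per_gene=2000, max_total_bp=10000):
    """Select exons under length and bp budgets.

    exons: iterable of (gene, chrom, start, end, rank), where rank 1 is the
    most 5-prime exon. Returns BED-style (chrom, start, end, name) tuples.
    """
    per_gene = {}
    total = 0
    selected = []
    # Visit genes alphabetically and, within each gene, 5-prime exons first.
    for gene, chrom, start, end, rank in sorted(exons, key=lambda e: (e[0], e[4])):
        length = end - start
        if length < min_len:
            continue  # too short to be useful
        if length > max_len:
            # Partial sampling of a very large exon (assumes plus-strand coordinates).
            end = start + max_len
            length = max_len
        if per_gene.get(gene, 0) + length > max_bp_per_gene:
            continue  # per-gene budget exhausted
        if total + length > max_total_bp:
            continue  # collection-wide budget exhausted
        per_gene[gene] = per_gene.get(gene, 0) + length
        total += length
        selected.append((chrom, start, end, f"{gene}_exon{rank}"))
    return selected
```

Each returned tuple maps directly onto one line of a bed file (chromosome, start, end, name).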

12.
Crop wild relatives (CWR) provide an important source of allelic diversity for any given crop plant species for counteracting the erosion of genetic diversity caused by domestication and elite breeding bottlenecks. Hordeum bulbosum L. represents the secondary gene pool of the genus Hordeum. It has been used as a source of genetic introgressions for improving elite barley germplasm (Hordeum vulgare L.). However, genetic introgressions from H. bulbosum have not yet been broadly applied, due to a lack of suitable molecular tools for locating and characterizing introgressed segments and for reducing their size by recombination and marker‐assisted backcrossing. We applied next‐generation sequencing (NGS) based strategies for unlocking genetic diversity of three diploid introgression lines of cultivated barley containing chromosomal segments of its close relative H. bulbosum. Firstly, exome capture‐based (re)‐sequencing revealed large numbers of single nucleotide polymorphisms (SNPs) enabling the precise allocation of H. bulbosum introgressions. This SNP resource was further exploited by designing a custom multiplex SNP genotyping assay. Secondly, two‐enzyme‐based genotyping‐by‐sequencing (GBS) was employed to allocate the introgressed H. bulbosum segments and to genotype a mapping population. Both methods provided fast and reliable detection and mapping of the introgressed segments and enabled the identification of recombinant plants. Thus, the utilization of H. bulbosum as a resource of natural genetic diversity in barley crop improvement will be greatly facilitated by these tools in the future.

13.
14.
15.
Establishing the sex of individuals in wild systems can be challenging and often requires genetic testing. Genotyping‐by‐sequencing (GBS) and other reduced‐representation DNA sequencing (RRS) protocols (e.g., RADseq, ddRAD) have enabled the analysis of genetic data on an unprecedented scale. Here, we present a novel approach for the discovery and statistical validation of sex‐specific loci in GBS data sets. We used GBS to genotype 166 New Zealand fur seals (NZFS, Arctocephalus forsteri) of known sex. We retained monomorphic loci as potential sex‐specific markers in the locus discovery phase. We then used (i) a sex‐specific locus threshold (SSLT) to identify significantly male‐specific loci within our data set; and (ii) a significant sex‐assignment threshold (SSAT) to confidently assign sex in silico, based on the presence or absence of significantly male‐specific loci, to individuals in our data set treated as unknowns (98.9% accuracy for females; 95.8% for males, estimated via cross‐validation). Furthermore, we assigned sex to 86 individuals of true unknown sex using our SSAT and assessed the effect of SSLT adjustments on these assignments. From 90 verified sex‐specific loci, we developed a panel of three sex‐specific PCR primers that we used to ascertain sex independently of our GBS data, which we show amplify reliably in at least two other pinniped species. Using monomorphic loci normally discarded from large SNP data sets is an effective way to identify robust sex‐linked markers for nonmodel species. Our novel pipeline can be used to identify and statistically validate monomorphic and polymorphic sex‐specific markers across a range of species and RRS data sets.

16.
17.
Sequencing pools of individuals rather than individuals separately reduces the costs of estimating allele frequencies at many loci in many populations. Theoretical and empirical studies show that sequencing pools comprising a limited number of individuals (typically fewer than 50) provides reliable allele frequency estimates, provided that the DNA pooling and DNA sequencing steps are carefully controlled. Unequal contributions of different individuals to the DNA pool and the mean and variance in sequencing depth both can affect the standard error of allele frequency estimates. To our knowledge, no study has separately investigated the effect of these two factors on allele frequency estimates, so there is currently no method to a priori estimate the relative importance of unequal individual DNA contributions independently of sequencing depth. We develop a new analytical model for allele frequency estimation that explicitly distinguishes these two effects. Our model shows that the DNA pooling variance in a pooled sequencing experiment depends solely on two factors: the number of individuals within the pool and the coefficient of variation of individual DNA contributions to the pool. We present a new method to experimentally estimate this coefficient of variation when planning a pooled sequencing design where samples are either pooled before or after DNA extraction. Using this analytical and experimental framework, we provide guidelines to optimize the design of pooled sequencing experiments. Finally, we sequence replicated pools of inbred lines of the plant Medicago truncatula and show that the predictions from our model generally hold true when estimating the frequency of known multilocus haplotypes using pooled sequencing.
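
The two quantities this abstract highlights, pool size and the coefficient of variation of individual DNA contributions, can be made concrete with a short Python sketch. It does not reproduce the authors' analytical model; it only computes the coefficient of variation and shows how unequal contributions shift the allele frequency present in the DNA pool. Function names and the diploid 0/1/2 genotype coding are assumptions.

```python
from statistics import mean, pstdev


def contribution_cv(contributions):
    """Coefficient of variation (population SD / mean) of individual DNA contributions."""
    return pstdev(contributions) / mean(contributions)


def pooled_allele_frequency(genotypes, contributions):
    """Allele frequency in the pool as a contribution-weighted mean.

    genotypes: allele counts per diploid individual (0, 1 or 2).
    contributions: relative amount of DNA each individual adds to the pool.
    """
    if len(genotypes) != len(contributions):
        raise ValueError("one contribution is needed per genotype")
    weighted = sum(w * g / 2 for g, w in zip(genotypes, contributions))
    return weighted / sum(contributions)
```

With genotypes `[2, 0, 1, 1]`, equal contributions give a pool frequency of 0.5, whereas tripling the first individual's contribution raises it to 2/3.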

18.
We isolated and characterized microsatellite loci in Viola mirabilis (Violaceae), an endangered species from South Korea. Twenty‐three polymorphic microsatellite loci were developed and tested in Korean, Chinese and Japanese populations. The number of alleles per locus varied from two to eight. The observed and expected heterozygosities within the three populations were 0.000–0.625 and 0.469–0.695, respectively. A total of six loci in the Korean population, one locus in the Chinese population and seven loci in the Japanese population deviated from Hardy–Weinberg equilibrium. We expect that these newly developed microsatellite markers will contribute to understanding the phylogeography and population genetics of V. mirabilis, which will aid in developing conservation strategies for this species.

19.
The persistence of coral reef ecosystems relies on the symbiotic relationship between scleractinian corals and intracellular, photosynthetic dinoflagellates in the genus Symbiodinium. Genetic evidence indicates that these symbionts are biologically diverse and exhibit discrete patterns of environmental and host distribution. This makes the assessment of Symbiodinium diversity critical to understanding the symbiosis ecology of corals. Here, we applied pyrosequencing to the elucidation of Symbiodinium diversity via analysis of the internal transcribed spacer 2 (ITS2) region, a multicopy genetic marker commonly used to analyse Symbiodinium diversity. Replicated data generated from isoclonal Symbiodinium cultures showed that all genomes contained numerous, yet mostly rare, ITS2 sequence variants. Pyrosequencing data were consistent with more traditional denaturing gradient gel electrophoresis (DGGE) approaches to the screening of ITS2 PCR amplifications, where the most common sequences appeared as the most intense bands. Further, we developed an operational taxonomic unit (OTU)‐based pipeline for Symbiodinium ITS2 diversity typing to provisionally resolve ecologically discrete entities from intragenomic variation. A genetic distance cut‐off of 0.03 collapsed intragenomic ITS2 variants of isoclonal cultures into single OTUs. When applied to the analysis of field‐collected coral samples, our analyses confirm that much of the commonly observed Symbiodinium ITS2 diversity can be attributed to intragenomic variation. We conclude that by analysing Symbiodinium populations in an OTU‐based framework, we can improve objectivity, comparability and simplicity when assessing ITS2 diversity in field‐based studies.
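
How a 0.03 distance cut‐off collapses rare variants into one OTU can be shown with a small Python sketch. The abstract does not specify the clustering algorithm, so the greedy, abundance‐ordered centroid scheme and the uncorrected p‐distance on pre‐aligned, equal‐length sequences used here are stand‐in assumptions, as are the function names.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_otus(seq_counts, cutoff=0.03):
    """Greedy clustering: visit sequences from most to least abundant and attach
    each one to the first OTU whose centroid lies within the distance cutoff."""
    otus = []
    for seq, count in sorted(seq_counts.items(), key=lambda kv: -kv[1]):
        for otu in otus:
            if p_distance(otu["centroid"], seq) <= cutoff:
                otu["count"] += count
                otu["members"].append(seq)
                break
        else:
            # No centroid is close enough, so this sequence seeds a new OTU.
            otus.append({"centroid": seq, "count": count, "members": [seq]})
    return otus
```

In this scheme a dominant ITS2 sequence absorbs its rare one‐ or two‐site variants, while a sequence differing at 10% of sites forms a separate OTU.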

20.
The choice of technology and bioinformatics approach is critical in obtaining accurate and reliable information from next‐generation sequencing (NGS) experiments. An increasing number of software tools and methodological guidelines are being published, but deciding upon which approach and experimental design to use can depend on the particularities of the species and on the aims of the study. This leaves researchers unable to make informed decisions on these central questions. To address these issues, we developed pipeliner – a tool to evaluate, by simulation, the performance of NGS pipelines in resequencing studies. Pipeliner provides a graphical interface allowing the users to write and test their own bioinformatics pipelines with publicly available or custom software. It computes a number of statistics summarizing the performance in SNP calling, including the recovery, sensitivity and false discovery rate for heterozygous and homozygous SNP genotypes. Pipeliner can be used to answer many practical questions, for example, for a limited amount of NGS effort, how many more reliable SNPs can be detected by doubling coverage and halving sample size or what is the false discovery rate provided by different SNP calling algorithms and options. Pipeliner thus allows researchers to carefully plan their study's sampling design and compare the suitability of alternative bioinformatics approaches for their specific study systems. Pipeliner is written in C++ and is freely available from http://github.com/brunonevado/Pipeliner.
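
Two of the statistics named in this abstract, sensitivity and false discovery rate, can be computed from a simulated truth set and a call set as in the Python sketch below. It uses the textbook definitions (TP / (TP + FN) and FP / (TP + FP)), which may differ from Pipeliner's exact definitions; the function name and the SNP key format are assumptions.

```python
def snp_calling_metrics(true_snps, called_snps):
    """Compare called SNPs against the simulated truth.

    Both arguments are iterables of hashable SNP keys, e.g. (chrom, pos, genotype).
    """
    true_snps, called_snps = set(true_snps), set(called_snps)
    tp = len(true_snps & called_snps)   # true SNPs that were called
    fp = len(called_snps - true_snps)   # calls with no matching true SNP
    fn = len(true_snps - called_snps)   # true SNPs that were missed
    sensitivity = tp / (tp + fn) if true_snps else float("nan")
    fdr = fp / (tp + fp) if called_snps else float("nan")
    return {"tp": tp, "fp": fp, "fn": fn, "sensitivity": sensitivity, "fdr": fdr}
```

Computing these metrics separately for heterozygous and homozygous genotypes only requires filtering the two sets by genotype before the call.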
