Similar articles
 20 similar articles found (search time: 15 ms)
1.
2.
3.
High‐throughput sequencing (HTS) is central to the study of population genomics and has an increasingly important role in constructing phylogenies. Choices in research design for sequencing projects can include a wide range of factors, such as sequencing platform, depth of coverage and bioinformatic tools. Simulating HTS data better informs these decisions, as users can validate software by comparing output to the known simulation parameters. However, current standalone HTS simulators cannot generate variant haplotypes under even somewhat complex evolutionary scenarios, such as recombination or demographic change. This greatly reduces their usefulness for fields such as population genomics and phylogenomics. Here I present the R package jackalope that simply and efficiently simulates (i) sets of variant haplotypes from a reference genome and (ii) reads from both Illumina and Pacific Biosciences platforms. Haplotypes can be simulated using phylogenies, gene trees, coalescent‐simulation output, population‐genomic summary statistics, and Variant Call Format (VCF) files. jackalope can simulate single, paired‐end or mate‐pair Illumina reads, as well as reads from Pacific Biosciences. These simulations include sequencing errors, mapping qualities, multiplexing and optical/PCR duplicates. It can read reference genomes from fasta files and can simulate new ones, and all outputs can be written to standard file formats. jackalope is available for Mac, Windows and Linux systems.
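
The idea of validating software against known simulation parameters can be illustrated with a minimal Python sketch. This is not jackalope's API (jackalope is an R package); the function `simulate_reads`, its parameters, and the uniform substitution‐error model are all assumptions made for illustration. Because each read records its true start position and the number of errors injected, downstream output can be checked against the truth.

```python
import random

def simulate_reads(reference, n_reads, read_len, error_rate, seed=0):
    """Sample fixed-length single-end reads uniformly from `reference`
    and inject substitution errors at a constant per-base rate.

    Hypothetical helper for illustration only; real simulators model
    platform-specific error profiles, qualities and duplicates.
    """
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(reference) - read_len + 1)
        bases = list(reference[start:start + read_len])
        n_errors = 0
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                # Substitute with a different base so every error is a mismatch.
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
                n_errors += 1
        # Keep the known truth (start, error count) alongside the read.
        reads.append((start, "".join(bases), n_errors))
    return reads

# Build a random reference (assumed here; a FASTA file would be used in practice).
ref_rng = random.Random(42)
reference = "".join(ref_rng.choice("ACGT") for _ in range(1000))

reads = simulate_reads(reference, n_reads=200, read_len=50, error_rate=0.01)
observed = sum(n for _, _, n in reads) / (200 * 50)
print(f"observed per-base error rate: {observed:.4f}")
```

The observed error rate should scatter around the simulated rate of 0.01; a tool under test can be judged by how well it recovers the recorded start positions and mismatches.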

4.
5.
High‐throughput sequencing methods have become a routine analysis tool in environmental sciences as well as in the public and private sectors. These methods provide vast amounts of data, which need to be analysed in several steps. Although the bioinformatics may be applied using several public tools, many analytical pipelines allow too few options for the optimal analysis of more complicated or customized designs. Here, we introduce PipeCraft, a flexible and handy bioinformatics pipeline with a user‐friendly graphical interface that links several public tools for analysing amplicon sequencing data. Users are able to customize the pipeline by selecting the most suitable tools and options to process raw sequences from Illumina, Pacific Biosciences, Ion Torrent and Roche 454 sequencing platforms. We describe the design and options of PipeCraft and evaluate its performance by analysing data sets from three different sequencing platforms. We demonstrate that PipeCraft is able to process large data sets within 24 hr. The graphical user interface and the automated links between various bioinformatics tools enable easy customization of the workflow. All analytical steps and options are recorded in log files and are easily traceable.

6.
7.
Dissecting evolutionary dynamics of ecologically important traits is a long‐term challenge for biologists. Attempts to understand natural variation and molecular mechanisms have motivated a move from laboratory model systems to non‐model systems in diverse natural environments. Next generation sequencing methods, along with an expansion of genomic resources and tools, have fostered new links between diverse disciplines, including molecular biology, evolution, ecology, and genomics. Great progress has been made in a few non‐model wild plants, such as Arabidopsis relatives, monkey flowers, and wild sunflowers. Until recently, the lack of comprehensive genomic information has limited evolutionary and ecological studies to larger QTL (quantitative trait locus) regions rather than single gene resolution, and has hindered recognition of general patterns of natural variation and local adaptation. Further efforts in accumulating genomic data and developing bioinformatic and biostatistical tools are now poised to move this field forward. Integrative national and international collaborations and research communities are needed to facilitate development in the field of evolutionary and ecological genomics.

8.
Research in evolutionary biology involving nonmodel organisms is rapidly shifting from using traditional molecular markers such as mtDNA and microsatellites to higher throughput SNP genotyping methodologies to address questions in population genetics, phylogenetics and genetic mapping. Restriction site associated DNA sequencing (RAD sequencing or RADseq) has become an established method for SNP genotyping on Illumina sequencing platforms. Here, we developed a protocol and adapters for double‐digest RAD sequencing for Ion Torrent (Life Technologies; Ion Proton, Ion PGM) semiconductor sequencing. We sequenced thirteen genomic libraries of three different nonmodel vertebrate species on Ion Proton with PI chips: Arctic charr Salvelinus alpinus, European whitefish Coregonus lavaretus and common lizard Zootoca vivipara. This resulted in ~962 million single‐end reads overall and a mean of ~74 million reads per library. We filtered the genomic data using Stacks, a bioinformatic tool to process RAD sequencing data. On average, we obtained ~11 000 polymorphic loci per library of 6–30 individuals. We validate our new method by technical and biological replication, by reconstructing phylogenetic relationships, and by using a hybrid genetic cross to track genomic variants. Finally, we discuss the differences between the sequencing platforms in the context of RAD sequencing, assessing possible advantages and disadvantages. We show that our protocol can be used on Ion semiconductor sequencing platforms for the rapid and cost‐effective generation of variable and reproducible genetic markers.

9.
Museum specimens provide a wealth of information to biologists, but obtaining genetic data from formalin‐fixed and fluid‐preserved specimens remains challenging. While DNA sequences have been recovered from such specimens, most approaches are time‐consuming and produce low data quality and quantity. Here, we use a modified DNA extraction protocol combined with high‐throughput sequencing to recover DNA from formalin‐fixed and fluid‐preserved snakes that were collected over a century ago and for which little or no modern genetic materials exist in public collections. We successfully extracted DNA and sequenced ultraconserved elements ( = 2318 loci) from 10 fluid‐preserved snakes and included them in a phylogeny with modern samples. This phylogeny demonstrates the general use of such specimens in phylogenomic studies and provides evidence for the placement of enigmatic snakes, such as the rare and never‐before sequenced Indian Xylophis stenorhynchus. Our study emphasizes the relevance of museum collections in modern research and simultaneously provides a protocol that may prove useful for specimens that have been previously intractable for DNA sequencing.

10.
11.
Genetic and genomics tools to characterize host–pathogen interactions are disproportionately directed to the host because of the focus on resistance. However, understanding the genetics of pathogen virulence is equally important and has been limited by the high cost of de novo genotyping of species with limited marker data. Non‐resource‐prohibitive methods that overcome the limitation of genotyping are now available through genotype‐by‐sequencing (GBS). The use of a two‐enzyme restriction‐associated DNA (RAD)‐GBS method adapted for Ion Torrent sequencing technology provided robust and reproducible high‐density genotyping of several fungal species. A total of 5783 and 2373 unique loci, ‘sequence tags’, containing 16 441 and 9992 single nucleotide polymorphisms (SNPs) were identified and characterized from natural populations of Pyrenophora teres f. maculata and Sphaerulina musiva, respectively. The data generated from the P. teres f. maculata natural population were used in association mapping analysis to map the mating‐type gene to high resolution. To further validate the methodology, a biparental population of P. teres f. teres, previously used to develop a genetic map utilizing simple sequence repeat (SSR) and amplified fragment length polymorphism (AFLP) markers, was re‐analysed using the SNP markers generated from this protocol. A robust genetic map containing 1393 SNPs on 997 sequence tags spread across 15 linkage groups with anchored reference markers was generated from the P. teres f. teres biparental population. The robust high‐density markers generated using this protocol will allow positional cloning in biparental fungal populations, association mapping of natural fungal populations and population genetics studies.

12.
13.
High‐throughput sequencing methods for genotyping genome‐wide markers are being rapidly adopted for phylogenetics of nonmodel organisms in conservation and biodiversity studies. However, the reproducibility of SNP genotyping and degree of marker overlap or compatibility between datasets from different methodologies have not been tested in nonmodel systems. Using double‐digest restriction site‐associated DNA sequencing, we sequenced a common set of 22 specimens from the butterfly genus Speyeria on two different Illumina platforms, using two variations of library preparation. We then used a de novo approach to bioinformatic locus assembly and SNP discovery for subsequent phylogenetic analyses. We found a high rate of locus recovery despite differences in library preparation and sequencing platforms, as well as overall high levels of data compatibility after data processing and filtering. These results provide the first application of NGS methods for phylogenetic reconstruction in Speyeria and support the use and long‐term viability of SNP genotyping applications in nonmodel systems.

14.
15.
Restriction site‐associated DNA sequencing (RAD‐Seq), a next‐generation sequencing‐based genome ‘complexity reduction’ protocol, has been useful in population genomics in species with a reference genome. However, the application of this protocol to natural populations of genomically underinvestigated species, particularly under low‐to‐medium sequencing depth, has not been well justified. In this study, a Bayesian method was developed for calling genotypes from an F2 population of bottle gourd [Lagenaria siceraria (Mol.) Standl.] to construct a high‐density genetic map. Low‐depth genome shotgun sequencing allowed the assembly of scaffolds/contigs comprising approximately 50% of the estimated genome, of which 922 were anchored for identifying syntenic regions between species. RAD‐Seq genotyping of a natural population comprising 80 accessions identified 3226 single nucleotide polymorphisms (SNPs), based on which two sub‐gene pools were suggested for association with fruit shape. The two sub‐gene pools were moderately differentiated, as reflected by a Hudson's FST value of 0.14, and they represent regions on LG7 with strikingly elevated FST values. A seven‐fold reduction in heterozygosity and a two‐fold increase in LD (r²) were observed in the same region for the round‐fruited sub‐gene pool. An outlier test suggested the locus LX3405 on LG7 to be a candidate site under selection. Comparative genomic analysis revealed that the cucumber genome region syntenic to the high FST island on LG7 harbors an ortholog of the tomato fruit shape gene OVATE. Our results point to a bright future of applying RAD‐Seq to population genomic studies for non‐model species even under low‐to‐medium sequencing efforts. The genomic resources provide valuable information for cucurbit genome research.
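
For readers unfamiliar with the statistic, the following Python sketch shows one commonly used sample‐size‐corrected form of Hudson's FST for biallelic SNPs, combined across loci as a ratio of averages. The abstract does not state which implementation the authors used, so the function names, the choice of estimator form, and the example frequencies are assumptions for illustration; `n1` and `n2` are taken to be the numbers of sampled allele copies in each population.

```python
def hudson_fst_components(p1, n1, p2, n2):
    """Per-SNP numerator and denominator of Hudson's FST.

    p1, p2: sample allele frequencies in populations 1 and 2.
    n1, n2: number of sampled allele copies (assumed here to be
            chromosomes, i.e. 2x the number of diploid individuals).
    """
    # Squared frequency difference, corrected for finite sample size.
    numerator = ((p1 - p2) ** 2
                 - p1 * (1 - p1) / (n1 - 1)
                 - p2 * (1 - p2) / (n2 - 1))
    # Probability that two alleles drawn from different populations differ.
    denominator = p1 * (1 - p2) + p2 * (1 - p1)
    return numerator, denominator

def hudson_fst(snps):
    """Combine SNPs as a ratio of summed numerators to summed denominators."""
    components = [hudson_fst_components(*snp) for snp in snps]
    total_num = sum(num for num, _ in components)
    total_den = sum(den for _, den in components)
    return total_num / total_den

# Hypothetical frequencies for three SNPs: (p1, n1, p2, n2).
example_snps = [
    (0.10, 80, 0.45, 80),
    (0.50, 80, 0.55, 80),
    (0.90, 80, 0.60, 80),
]
print(f"Hudson's FST across loci: {hudson_fst(example_snps):.3f}")
```

Summing numerators and denominators before dividing, rather than averaging per‐SNP ratios, keeps low‐frequency loci from dominating the genome‐wide value; a windowed version of the same calculation is one way to look for elevated regions such as the one reported on LG7.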

16.
17.
The use of high‐throughput, low‐density sequencing approaches has dramatically increased in recent years in studies of eco‐evolutionary processes in wild populations and domestication in commercial aquaculture. Most of these studies focus on identifying panels of SNP loci for a single downstream application, whereas there have been few studies examining the trade‐offs for selecting panels of markers for use in multiple applications. Here, we detail the use of a bioinformatic workflow for the development of a dual‐purpose SNP panel for parentage and population assignment, which included identifying putative SNP loci, filtering for the most informative loci for the two tasks, designing effective multiplex PCR primers, optimizing the SNP panel for performance, and performing quality control steps for downstream applications. We applied this workflow to two adjacent Alaskan Sockeye Salmon populations and identified a GTseq panel of 142 SNP loci for parentage and 35 SNP loci for population assignment. Only 50–75 panel loci were necessary for >95% accurate parentage, whereas population assignment success, with all 172 panel loci, ranged from 93.9% to 96.2%. Finally, we discuss the trade‐offs and complexities of the decision‐making process that drives SNP panel development, optimization, and testing.

18.
Molecular markers produced by next‐generation sequencing (NGS) technologies are revolutionizing genetic research. However, the costs of analysing large numbers of individual genomes remain prohibitive for most population genetics studies. Here, we present results based on mathematical derivations showing that, under many realistic experimental designs, NGS of DNA pools from diploid individuals allows allele frequencies at single nucleotide polymorphisms (SNPs) to be estimated with at least the same accuracy as individual‐based analyses, for considerably lower library construction and sequencing efforts. These findings remain true when taking into account the possibility of substantially unequal contributions of each individual to the final pool of sequence reads. We propose the intuitive notion of effective pool size to account for unequal pooling and derive a Bayesian hierarchical model to estimate this parameter directly from the data. We provide a user‐friendly application assessing the accuracy of allele frequency estimation from both pool‐ and individual‐based NGS population data under various sampling, sequencing depth and experimental error designs. We illustrate our findings with theoretical examples and real data sets corresponding to SNP loci obtained using restriction site–associated DNA (RAD) sequencing in pool‐ and individual‐based experiments carried out on the same population of the pine processionary moth (Thaumetopoea pityocampa). NGS of DNA pools might not be optimal for all types of studies but provides a cost‐effective approach for estimating allele frequencies for very large numbers of SNPs. It thus allows comparison of genome‐wide patterns of genetic variation for large numbers of individuals in multiple populations.
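
The trade‐off described in this abstract can be sketched with a simplified two‐stage binomial model in Python. This is not the authors' derivation or their Bayesian model: it assumes every individual contributes equally to the pool, no sequencing or experimental error, and error‐free genotypes in the individual‐based design. The function names and parameter values are hypothetical.

```python
import random
import statistics

def pooled_variance(p, n_ind, depth):
    """Variance of the pooled allele-frequency estimate under two-stage
    binomial sampling: 2*n_ind chromosomes from the population, then
    `depth` reads from the pool (equal contributions assumed)."""
    n_chrom = 2 * n_ind
    return p * (1 - p) * (1 / n_chrom + (1 - 1 / n_chrom) / depth)

def individual_variance(p, n_ind):
    """Variance when n_ind diploids are genotyped without error."""
    return p * (1 - p) / (2 * n_ind)

def simulate_pool_estimate(p, n_ind, depth, rng):
    """Draw one pooled estimate: sample chromosomes, then sample reads."""
    n_chrom = 2 * n_ind
    alt_copies = sum(rng.random() < p for _ in range(n_chrom))
    pool_freq = alt_copies / n_chrom
    alt_reads = sum(rng.random() < pool_freq for _ in range(depth))
    return alt_reads / depth

# Hypothetical design: a pool of 50 individuals at 100x total depth
# versus 10 individually genotyped diploids.
p, pool_n, depth, indiv_n = 0.3, 50, 100, 10
rng = random.Random(1)
estimates = [simulate_pool_estimate(p, pool_n, depth, rng) for _ in range(2000)]
empirical = statistics.pvariance(estimates)

print(f"pool, analytic variance:  {pooled_variance(p, pool_n, depth):.5f}")
print(f"pool, simulated variance: {empirical:.5f}")
print(f"individuals, variance:    {individual_variance(p, indiv_n):.5f}")
```

Under these assumptions the single pooled library gives a smaller sampling variance than ten individual libraries. Unequal contributions would shrink the number of chromosomes effectively sampled, which is the situation the abstract's effective pool size is meant to capture.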

19.
20.
While various technologies for high‐throughput genotyping have been developed for ecological studies, simple methods tolerant to low‐quality DNA samples are still limited. In this study, we tested the applicability of a random PCR‐based genotyping‐by‐sequencing technology, genotyping by random amplicon sequencing, direct (GRAS‐Di). We focused on population genetic analysis of estuarine mangrove fishes, including two resident species, the Amboina cardinalfish (Fibramia amboinensis, Bleeker, 1853) and the Duncker's river garfish (Zenarchopterus dunckeri, Mohr, 1926), and a marine migrant, the blacktail snapper (Lutjanus fulvus, Forster, 1801). Collections were from the Ryukyu Islands, southern Japan. PCR amplicons derived from ~130 individuals were pooled and sequenced in a single lane on a HiSeq2500 platform, and an average of three million reads was obtained per individual. Consensus contigs were assembled for each species and used for genotyping of single nucleotide polymorphisms by mapping trimmed reads onto the contigs. After quality filtering steps, 4,000–9,000 putative single nucleotide polymorphisms were detected for each species. Although DNA fragmentation can diminish genotyping performance when analysed on next‐generation sequencing technology, the effect was small. Genetic differentiation and a clear pattern of isolation‐by‐distance were observed in F. amboinensis and Z. dunckeri by means of principal component analysis, FST and admixture analysis. By contrast, L. fulvus comprised a genetically homogeneous population with directional recent gene flow. These genetic differentiation patterns reflect patterns of estuary use through life history. These results showed the power of GRAS‐Di for fine‐grained genetic analysis using field samples, including mangrove fishes.
