首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Target sequence capture is an efficient technique to enrich specific genomic regions for high‐throughput sequencing in ecological and evolutionary studies. In recent years, many sequence capture approaches have been proposed, but most of them rely on commercial synthetic baits which make the experiment expensive. Here, we present a novel sequence capture approach called AFLP‐based genome sequence capture (AFLP Capture). This method uses the AFLP (amplified fragment length polymorphism) technique to generate homemade capture baits without the need for prior genome information, thus is applicable to any organisms. In this approach, biotinylated AFLP fragments representing a random fraction of the genome are used as baits to capture the homologous fragments from genomic shotgun sequencing libraries. In a trial study, by using AFLP Capture, we successfully obtained 511 orthologous loci (>700,000 bp in total length) from 11 Odorrana species and more than 100,000 single nucleotide polymorphisms (SNPs) in four analyzed individuals of an Odorrana species. This result shows that our method can be used to address questions of various evolutionary depths (from interspecies level to intraspecies level). We also discuss the flexibility in bait preparation and how the sequencing data are analyzed. In summary, AFLP Capture is a rapid and flexible tool and can significantly reduce the experimental cost for phylogenetic studies that require analyzing genome‐scale data (hundreds or thousands of loci).  相似文献   

2.
Population genetic studies of nonmodel organisms frequently employ reduced representation library (RRL) methodologies, many of which rely on protocols in which genomic DNA is digested by one or more restriction enzymes. However, because high molecular weight DNA is recommended for these protocols, samples with degraded DNA are generally unsuitable for RRL methods. Given that ancient and historic specimens can provide key temporal perspectives to evolutionary questions, we explored how custom‐designed RNA probes could enrich for RRL loci (Restriction Enzyme‐Associated Loci baits, or REALbaits). Starting with genotyping‐by‐sequencing (GBS) data generated on modern common ragweed (Ambrosia artemisiifolia L.) specimens, we designed 20 000 RNA probes to target well‐characterized genomic loci in herbarium voucher specimens dating from 1835 to 1913. Compared to shotgun sequencing, we observed enrichment of the targeted loci at 19‐ to 151‐fold. Using our GBS capture pipeline on a data set of 38 herbarium samples, we discovered 22 813 SNPs, providing sufficient genomic resolution to distinguish geographic populations. For these samples, we found that dilution of REALbaits to 10% of their original concentration still yielded sufficient data for downstream analyses and that a sequencing depth of ~7m reads was sufficient to characterize most loci without wasting sequencing capacity. In addition, we observed that targeted loci had highly variable rates of success, which we primarily attribute to similarity between loci, a trait that ultimately interferes with unambiguous read mapping. Our findings can help researchers design capture experiments for RRL loci, thereby providing an efficient means to integrate samples with degraded DNA into existing RRL data sets.  相似文献   

3.
Next‐generation sequencing (NGS) is emerging as an efficient and cost‐effective tool in population genomic analyses of nonmodel organisms, allowing simultaneous resequencing of many regions of multi‐genomic DNA from multiplexed samples. Here, we detail our synthesis of protocols for targeted resequencing of mitochondrial and nuclear loci by generating indexed genomic libraries for multiplexing up to 100 individuals in a single sequencing pool, and then enriching the pooled library using custom DNA capture arrays. Our use of DNA sequence from one species to capture and enrich the sequencing libraries of another species (i.e. cross‐species DNA capture) indicates that efficient enrichment occurs when sequences are up to about 12% divergent, allowing us to take advantage of genomic information in one species to sequence orthologous regions in related species. In addition to a complete mitochondrial genome on each array, we have included between 43 and 118 nuclear loci for low‐coverage sequencing of between 18 kb and 87 kb of DNA sequence per individual for single nucleotide polymorphisms discovery from 50 to 100 individuals in a single sequencing lane. Using this method, we have generated a total of over 500 whole mitochondrial genomes from seven cetacean species and green sea turtles. The greater variation detected in mitogenomes relative to short mtDNA sequences is helping to resolve genetic structure ranging from geographic to species‐level differences. These NGS and analysis techniques have allowed for simultaneous population genomic studies of mtDNA and nDNA with greater genomic coverage and phylogeographic resolution than has previously been possible in marine mammals and turtles.  相似文献   

4.
Research in evolutionary biology involving nonmodel organisms is rapidly shifting from using traditional molecular markers such as mtDNA and microsatellites to higher throughput SNP genotyping methodologies to address questions in population genetics, phylogenetics and genetic mapping. Restriction site associated DNA sequencing (RAD sequencing or RADseq) has become an established method for SNP genotyping on Illumina sequencing platforms. Here, we developed a protocol and adapters for double‐digest RAD sequencing for Ion Torrent (Life Technologies; Ion Proton, Ion PGM) semiconductor sequencing. We sequenced thirteen genomic libraries of three different nonmodel vertebrate species on Ion Proton with PI chips: Arctic charr Salvelinus alpinus, European whitefish Coregonus lavaretus and common lizard Zootoca vivipara. This resulted in ~962 million single‐end reads overall and a mean of ~74 million reads per library. We filtered the genomic data using Stacks, a bioinformatic tool to process RAD sequencing data. On average, we obtained ~11 000 polymorphic loci per library of 6–30 individuals. We validate our new method by technical and biological replication, by reconstructing phylogenetic relationships, and using a hybrid genetic cross to track genomic variants. Finally, we discuss the differences between using the different sequencing platforms in the context of RAD sequencing, assessing possible advantages and disadvantages. We show that our protocol can be used for Ion semiconductor sequencing platforms for the rapid and cost‐effective generation of variable and reproducible genetic markers.  相似文献   

5.
Molecular inversion probe (MIP)-based capture is a scalable and effective target-enrichment technology that can use synthetic single-stranded oligonucleotides as probes. Unlike the straightforward use of synthetic oligonucleotides for low-throughput target capture, high-throughput MIP capture has required laborious protocols to generate thousands of single-stranded probes from DNA microarray because of multiple enzymatic steps, gel purifications and extensive PCR amplifications. Here, we developed a simple and efficient microarray-based MIP preparation protocol using only one enzyme with double-stranded probes and improved target capture yields by designing probes with overlapping targets and unique barcodes. To test our strategy, we produced 11 510 microarray-based duplex MIPs (microDuMIPs) and captured 3554 exons of 228 genes in a HapMap genomic DNA sample (NA12878). Under our protocol, capture performance and precision of calling were compatible to conventional MIP capture methods, yet overlapping targets and unique barcodes allowed us to precisely genotype with as little as 50 ng of input genomic DNA without library preparation. microDuMIP method is simpler and cheaper, allowing broader applications and accurate target sequencing with a scalable number of targets.  相似文献   

6.
Combining high‐throughput sequencing with targeted sequence capture has become an attractive tool to study specific genomic regions of interest. Most studies have so far focused on the exome using short‐read technology. These approaches are not designed to capture intergenic regions needed to reconstruct genomic organization, including regulatory regions and gene synteny. Here, we demonstrate the power of combining targeted sequence capture with long‐read sequencing technology for comparative genomic analyses of the haemoglobin (Hb) gene clusters across eight species separated by up to 70 million years. Guided by the reference genome assembly of the Atlantic cod (Gadus morhua) together with genome information from draft assemblies of selected codfishes, we designed probes covering the two Hb gene clusters. Use of custom‐made barcodes combined with PacBio RSII sequencing led to highly continuous assemblies of the LA (~100 kb) and MN (~200 kb) clusters, which include syntenic regions of coding and intergenic sequences. Our results revealed an overall conserved genomic organization of the Hb genes within this lineage, yet with several, lineage‐specific gene duplications. Moreover, for some of the species examined, we identified amino acid substitutions at two sites in the Hbb1 gene as well as length polymorphisms in its regulatory region, which has previously been linked to temperature adaptation in Atlantic cod populations. This study highlights the use of targeted long‐read capture as a versatile approach for comparative genomic studies by generation of a cross‐species genomic resource elucidating the evolutionary history of the Hb gene family across the highly divergent group of codfishes.  相似文献   

7.
8.
Cichlid fishes (family Cichlidae) are models for evolutionary and ecological research. Massively parallel sequencing approaches have been successfully applied to study relatively recent diversification in groups of African and Neotropical cichlids, but such technologies have yet to be used for addressing larger‐scale phylogenetic questions of cichlid evolution. Here, we describe a process for identifying putative single‐copy exons from five African cichlid genomes and sequence the targeted exons for a range of divergent (>tens of millions of years) taxa with probes designed from a single reference species (Oreochromis niloticus, Nile tilapia). Targeted sequencing of 923 exons across 10 cichlid species that represent the family's major lineages and geographic distribution resulted in a complete taxon matrix of 564 exons (649 549 bp), representing 559 genes. Maximum likelihood and Bayesian analyses in both species tree and concatenation frameworks yielded the same fully resolved and highly supported topology, which matched the expected backbone phylogeny of the major cichlid lineages. This work adds to the body of evidence that it is possible to use a relatively divergent reference genome for exon target design and successful capture across a broad phylogenetic range of species. Furthermore, our results show that the use of a third‐party laboratory coupled with accessible bioinformatics tools makes such phylogenomics projects feasible for research groups that lack direct access to genomic facilities. We expect that these resources will be used in further cichlid evolution studies and hope the protocols and identified targets will also be useful for phylogenetic studies of a wider range of organisms.  相似文献   

9.
10.
Chen D  Zhang W  Zhu ZD  Huang Y  Wang P  Zhou BB  Yang XN  Xiao HS  Zhang QH 《遗传》2010,32(12):1296-1303
文章旨在建立一种基因组目标靶序列捕捉文库的方法,并结合第二代测序技术,以实现候选基因区段的深度测序。利用Agilent公司的eArray在线平台,对1250个基因的11824个外显子共2414977bp的基因组序列进行120个碱基长度的捕捉探针(钓饵)设计,并制备成SureSelect液相靶序列捕获试剂。选用2例人基因组DNA,超声打断后末端补平并磷酸化,连接SOLiD接头,回收150bp~200bp的DNA片段,与靶序列探针杂交捕获目标序列,油包水微乳滴PCR扩增后,磁珠分离富集,上SOLiD测序系统通过工作流程分析(WFA)进行文库质量的评价,或正式测序反应。结果显示对所包含的11147个基因外显子片段设计出并合成了46509个捕捉探针,制备成SureSelect试剂盒。探针可有效地捕捉并富集基因组DNA的目标靶片段,定量PCR显示富集效率可达29倍。WFA分析表明文库可以在SOLiD仪器进行正式测序。测序结果显示靶序列区域的测序数占有效总测序数的比例达到70%,覆盖率均在200×以上。结果表明本研究所建立的SureSelect基因组靶序列捕捉、富集建立测序文库的技术路线可行,可直接用于SOLiD测序仪的测序。  相似文献   

11.
The genomics revolution has initiated a new era of population genetics where genome‐wide data are frequently used to understand complex patterns of population structure and selection. However, the application of genomic tools to inform management and conservation has been somewhat rare outside a few well studied species. Fortunately, two recently developed approaches, amplicon sequencing and sequence capture, have the potential to significantly advance the field of conservation genomics. Here, amplicon sequencing refers to highly multiplexed PCR followed by high‐throughput sequencing (e.g., GTseq), and sequence capture refers to using capture probes to isolate loci from reduced‐representation libraries (e.g., Rapture). Both approaches allow sequencing of thousands of individuals at relatively low costs, do not require any specialized equipment for library preparation, and generate data that can be analyzed without sophisticated computational infrastructure. Here, we discuss the advantages and disadvantages of each method and provide a decision framework for geneticists who are looking to integrate these methods into their research programme. While it will always be important to consider the specifics of the biological question and system, we believe that amplicon sequencing is best suited for projects aiming to genotype <500 loci on many individuals (>1,500) or for species where continued monitoring is anticipated (e.g., long‐term pedigrees). Sequence capture, on the other hand, is best applied to projects including fewer individuals or where >500 loci are required. Both of these techniques should smooth the transition from traditional genetic techniques to genomics, helping to usher in the conservation genomics era.  相似文献   

12.
The large genome size of many species hinders the development and application of genomic tools to study them. For instance, loblolly pine (Pinus taeda L.), an ecologically and economically important conifer, has a large and yet uncharacterized genome of 21.7 Gbp. To characterize the pine genome, we performed exome capture and sequencing of 14 729 genes derived from an assembly of expressed sequence tags. Efficiency of sequence capture was evaluated and shown to be similar across samples with increasing levels of complexity, including haploid cDNA, haploid genomic DNA and diploid genomic DNA. However, this efficiency was severely reduced for probes that overlapped multiple exons, presumably because intron sequences hindered probe:exon hybridizations. Such regions could not be entirely avoided during probe design, because of the lack of a reference sequence. To improve the throughput and reduce the cost of sequence capture, a method to multiplex the analysis of up to eight samples was developed. Sequence data showed that multiplexed capture was reproducible among 24 haploid samples, and can be applied for high‐throughput analysis of targeted genes in large populations. Captured sequences were de novo assembled, resulting in 11 396 expanded and annotated gene models, significantly improving the knowledge about the pine gene space. Interspecific capture was also evaluated with over 98% of all probes designed from P. taeda that were efficient in sequence capture, were also suitable for analysis of the related species Pinus elliottii Engelm.  相似文献   

13.
14.
Plastid sequencing is an essential tool in the study of plant evolution. This high‐copy organelle is one of the most technically accessible regions of the genome, and its sequence conservation makes it a valuable region for comparative genome evolution, phylogenetic analysis and population studies. Here, we discuss recent innovations and approaches for de novo plastid assembly that harness genomic tools. We focus on technical developments including low‐cost sequence library preparation approaches for genome skimming, enrichment via hybrid baits and methylation‐sensitive capture, sequence platforms with higher read outputs and longer read lengths, and automated tools for assembly. These developments allow for a much more streamlined assembly than via conventional short‐range PCR. Although newer methods make complete plastid sequencing possible for any land plant or green alga, there are still challenges for producing finished plastomes particularly from herbarium material or from structurally divergent plastids such as those of parasitic plants.  相似文献   

15.
We present a cost‐effective metabarcoding approach, aMPlex Torrent, which relies on an improved multiplex PCR adapted to highly degraded DNA, combining barcoding and next‐generation sequencing to simultaneously analyse many heterogeneous samples. We demonstrate the strength of these improvements by generating a phylochronology through the genotyping of ancient rodent remains from a Moroccan cave whose stratigraphy covers the last 120 000 years. Rodents are important for epidemiology, agronomy and ecological investigations and can act as bioindicators for human‐ and/or climate‐induced environmental changes. Efficient and reliable genotyping of ancient rodent remains has the potential to deliver valuable phylogenetic and paleoecological information. The analysis of multiple ancient skeletal remains of very small size with poor DNA preservation, however, requires a sensitive high‐throughput method to generate sufficient data. We show this approach to be particularly adapted at accessing this otherwise difficult taxonomic and genetic resource. As a highly scalable, lower cost and less labour‐intensive alternative to targeted sequence capture approaches, we propose the aMPlex Torrent strategy to be a useful tool for the genetic analysis of multiple degraded samples in studies involving ecology, archaeology, conservation and evolutionary biology.  相似文献   

16.
Biodiversity, phylogeography and population genetic studies will be revolutionized by access to large data sets thanks to next‐generation sequencing methods. In this study, we develop an easy and cost‐effective protocol for in‐solution enrichment hybridization capture of complete chloroplast genomes applicable at deep‐multiplexed levels. The protocol uses cheap in‐house species‐specific probes developed via long‐range PCR of the entire chloroplast. Barcoded libraries are constructed, and in‐solution enrichment of the chloroplasts is carried out using the probes. This protocol was tested and validated on six economically important West African crop species, namely African rice, pearl millet, three African yam species and fonio. For pearl millet, we also demonstrate the effectiveness of this protocol to retrieve 95% of the sequence of the whole chloroplast on 95 multiplexed individuals in a single MiSeq run at a success rate of 95%. This new protocol allows whole chloroplast genomes to be retrieved at a modest cost and will allow unprecedented resolution for closely related species in phylogeography studies using plastomes.  相似文献   

17.
Flexibility and low cost make genotyping‐by‐sequencing (GBS) an ideal tool for population genomic studies of nonmodel species. However, to utilize the potential of the method fully, many parameters affecting library quality and single nucleotide polymorphism (SNP) discovery require optimization, especially for conifer genomes with a high repetitive DNA content. In this study, we explored strategies for effective GBS analysis in pine species. We constructed GBS libraries using HpaII, PstI and EcoRI‐MseI digestions with different multiplexing levels and examined the effect of restriction enzymes on library complexity and the impact of sequencing depth and size selection of restriction fragments on sequence coverage bias. We tested and compared UNEAK, Stacks and GATK pipelines for the GBS data, and then developed a reference‐free SNP calling strategy for haploid pine genomes. Our GBS procedure proved to be effective in SNP discovery, producing 7000–11 000 and 14 751 SNPs within and among three pine species, respectively, from a PstI library. This investigation provides guidance for the design and analysis of GBS experiments, particularly for organisms for which genomic information is lacking.  相似文献   

18.
ABSTRACT: BACKGROUND: Solution-based targeted genomic enrichment (TGE) protocols permit selective sequencing of genomic regions of interest on a massively parallel scale. These protocols could be improved by: 1) modifying or eliminating time consuming steps; 2) increasing yield to reduce input DNA and excessive PCR cycling; and 3) enhancing reproducible. RESULTS: We developed a solution-based TGE method for downstream Illumina sequencing in a non-automated workflow, adding standard Illumina barcode indexes during the post-hybridization amplification to allow for sample pooling prior to sequencing. The method utilizes Agilent SureSelect baits, primers and hybridization reagents for the capture, off-the-shelf reagents for the library preparation steps, and adaptor oligonucleotides for Illumina paired-end sequencing purchased directly from an oligonucleotide manufacturing company. CONCLUSIONS: This solution-based TGE method for Illumina sequencing is optimized for small- or medium-sized laboratories and addresses the weaknesses of standard protocols by reducing the amount of input DNA required, increasing capture yield, optimizing efficiency, and improving reproducibility.  相似文献   

19.
The low costs of array‐synthesized oligonucleotide libraries are empowering rapid advances in quantitative and synthetic biology. However, high synthesis error rates, uneven representation, and lack of access to individual oligonucleotides limit the true potential of these libraries. We have developed a cost‐effective method called Recombinase Directed Indexing (REDI), which involves integration of a complex library into yeast, site‐specific recombination to index library DNA, and next‐generation sequencing to identify desired clones. We used REDI to generate a library of ~3,300 DNA probes that exhibited > 96% purity and remarkable uniformity (> 95% of probes within twofold of the median abundance). Additionally, we created a collection of ~9,000 individually accessible CRISPR interference yeast strains for > 99% of genes required for either fermentative or respiratory growth, demonstrating the utility of REDI for rapid and cost‐effective creation of strain collections from oligonucleotide pools. Our approach is adaptable to any complex DNA library, and fundamentally changes how these libraries can be parsed, maintained, propagated, and characterized.  相似文献   

20.
Understanding the genetics of biological diversification across micro‐ and macro‐evolutionary time scales is a vibrant field of research for molecular ecologists as rapid advances in sequencing technologies promise to overcome former limitations. In palms, an emblematic, economically and ecologically important plant family with high diversity in the tropics, studies of diversification at the population and species levels are still hampered by a lack of genomic markers suitable for the genotyping of large numbers of recently diverged taxa. To fill this gap, we used a whole genome sequencing approach to develop target sequencing for molecular markers in 4,184 genome regions, including 4,051 genes and 133 non‐genic putatively neutral regions. These markers were chosen to cover a wide range of evolutionary rates allowing future studies at the family, genus, species and population levels. Special emphasis was given to the avoidance of copy number variation during marker selection. In addition, a set of 149 well‐known sequence regions previously used as phylogenetic markers by the palm biological research community were included in the target regions, to open the possibility to combine and jointly analyse already available data sets with genomic data to be produced with this new toolkit. The bait set was effective for species belonging to all three palm sub‐families tested (Arecoideae, Ceroxyloideae and Coryphoideae), with high mapping rates, specificity and efficiency. The number of high‐quality single nucleotide polymorphisms (SNPs) detected at both the sub‐family and population levels facilitates efficient analyses of genomic diversity across micro‐ and macro‐evolutionary time scales.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号