Similar literature
20 similar documents found
1.
Reduced representation genome sequencing such as restriction‐site‐associated DNA (RAD) sequencing is finding increased use to identify and genotype large numbers of single‐nucleotide polymorphisms (SNPs) in model and nonmodel species. We generated a unique resource of novel SNP markers for the European eel using the RAD sequencing approach; the markers were simultaneously identified and scored in a genome‐wide scan of 30 individuals. Whereas genomic resources are increasingly becoming available for this species, including the recent release of a draft genome, no genome‐wide set of SNP markers was available until now. The generated SNPs were widely distributed across the eel genome, aligning to 4779 different contigs and 19 703 different scaffolds. Significant variation was identified, with an average nucleotide diversity of 0.00529 across individuals. Results varied widely across the genome, ranging from 0.00048 to 0.00737 per locus. Based on the average nucleotide diversity across all loci, long‐term effective population size was estimated to range between 132 000 and 1 320 000, which is much higher than previous estimates based on microsatellite loci. The generated SNP resource consisting of 82 425 loci and 376 918 associated SNPs provides a valuable tool for future population genetics and genomics studies and allows for targeting specific genes and particularly interesting regions of the eel genome.
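The effective population size range quoted in this abstract is consistent with the standard neutral expectation for a diploid population, π = 4·Ne·μ. The sketch below rearranges that relation to Ne = π / (4μ). The two mutation rates (1e-8 and 1e-9 per site per generation) are an assumption: the abstract does not state them, and they are used here only because they reproduce the reported bounds. The function name is hypothetical.

```python
def effective_population_size(pi, mu):
    """Long-term Ne for a diploid population from pi = 4 * Ne * mu.

    pi: average nucleotide diversity per site
    mu: mutation rate per site per generation (assumed, not from the abstract)
    """
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4.0 * mu)


# Average nucleotide diversity reported in the abstract.
pi = 0.00529

# Assumed mutation rates per site per generation (illustrative only).
ne_low = effective_population_size(pi, 1e-8)
ne_high = effective_population_size(pi, 1e-9)

print(round(ne_low), round(ne_high))
```

Under these assumed rates the two values come out at roughly 132 000 and 1 320 000, matching the range in the abstract; a different choice of μ would shift both bounds proportionally.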

2.
Single nucleotide polymorphisms (SNPs) are rapidly replacing anonymous markers in population genomic studies, but their use in non‐model organisms is hampered by the scarcity of cost‐effective approaches to uncover genome‐wide variation in a comprehensive subset of individuals. The screening of one or only a few individuals induces ascertainment bias. To discover SNPs for a population genomic study of the Pyrenean rocket (Sisymbrium austriacum subsp. chrysanthum), we undertook a pooled RAD‐PE (Restriction site Associated DNA Paired‐End sequencing) approach. RAD tags were generated from the PstI‐digested pooled genomic DNA of 12 individuals sampled across the species distribution range and paired‐end sequenced using Illumina technology to produce ~24.5 Mb of sequences, covering ~7% of the species' genome. Sequences were assembled into ~76 000 contigs with a mean length of 323 bp (N50 = 357 bp, sequencing depth = 24x). In all, >15 000 SNPs were called, of which 47% were annotated in putative genic regions based on homology with the Arabidopsis thaliana genome. Gene ontology (GO) slim categorization demonstrated that the identified SNPs covered extant genic variation well. The validation of 300 SNPs on a larger set of individuals using a KASPar assay underpinned the utility of pooled RAD‐PE as an inexpensive genome‐wide SNP discovery technique (success rate: 87%). In addition to SNPs, we discovered >600 putative SSR markers.
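The N50 figure reported for the assembly is the contig length at which contigs of that length or longer hold at least half of the assembled bases. A minimal sketch of that calculation follows; the function name and the example lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in bp, for illustration only.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```

Because N50 weights long contigs more heavily than short ones, it can exceed the mean contig length, as it does in the abstract (357 bp against a mean of 323 bp).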

3.
Genes of the major histocompatibility complex (MHC) have received much attention in immunology, genetics, and ecology because they are highly polymorphic and play important roles in parasite resistance and mate choice. Until recently, the MHC of passerine birds was not well described. However, the genome sequencing of the zebra finch (Taeniopygia guttata) has partially redressed this gap in our knowledge of avian MHC genes. Here, we contribute further to the understanding of the zebra finch MHC organization by mapping SNPs within or close to known MHC genes in the zebra finch genome. MHC class I and IIB genes were both mapped to zebra finch chromosome 16, and there was no evidence that MHC class I genes are located on chromosome 22 (as suggested by the genome assembly). We confirm the location in the MHC region on chromosome 16 for several other genes (BRD2, FLOT1, TRIM7.2, GNB2L1, and CSNK2B). Two of these (CSNK2B and FLOT1) have not previously been mapped in any other bird species. In line with previous results, we also find that orthologs to the immune-related genes B-NK and CLEC2D, which are part of the MHC region in chicken, are situated on zebra finch chromosome Z and not among other MHC genes in the zebra finch.

4.
With the advent of next generation sequencing, new avenues have opened to study genomics in wild populations of non‐model species. Here, we describe a successful approach to a genome‐wide medium density Single Nucleotide Polymorphism (SNP) panel in a non‐model species, the house sparrow (Passer domesticus), through the development of a 10 K Illumina iSelect HD BeadChip. Genomic DNA and cDNA derived from six individuals were sequenced on a 454 GS FLX system and generated a total of 1.2 million sequences, in which SNPs were detected. As no reference genome exists for the house sparrow, we used the zebra finch (Taeniopygia guttata) reference genome to determine the most likely position of each SNP. The 10 000 SNPs on the SNP‐chip were selected to be distributed evenly across 31 chromosomes, giving on average one SNP per 100 000 bp. The SNP‐chip was screened across 1968 individual house sparrows from four island populations. Of the original 10 000 SNPs, 7413 were found to be variable, and 99% of these SNPs were successfully called in at least 93% of all individuals. We used the SNP‐chip to demonstrate the ability of such genome‐wide marker data to detect population sub‐division, and compared these results to similar analyses using microsatellites. The SNP‐chip will be used to map Quantitative Trait Loci (QTL) for fitness‐related phenotypic traits in natural populations.

5.
Here we describe the complete nucleotide sequence of the mitochondrial genome (16 583/4 bp) of the zebra finch (Taeniopygia guttata). Primers were designed based on highly conserved regions of an alignment of three passerine complete mitochondrial DNA (mtDNA) sequences. A combination of overlapping long polymerase chain reaction (PCR) purification, followed by fully nested PCR and sequencing, was used to determine the complete mtDNA genome. Six birds from distinct maternal lineages of a pedigreed population were sequenced. Five novel haplotypes were identified. These sequences provide the first data for sequence variation across the whole mitochondrial genome of a passerine bird species.

6.
Single‐nucleotide polymorphisms (SNPs) are rapidly becoming the standard markers in population genomics studies; however, their use in nonmodel organisms is limited due to the lack of cost‐effective approaches to uncover genome‐wide variation, and the large number of individuals needed in the screening process to reduce ascertainment bias. To discover SNPs for population genomics studies in the fungal symbionts of the mountain pine beetle (MPB), we developed a road map to discover SNPs and to produce a genotyping platform. We undertook a whole‐genome sequencing approach of Leptographium longiclavatum in combination with available genomics resources of another MPB symbiont, Grosmannia clavigera. We sequenced 71 individuals pooled into four groups using the Illumina sequencing technology. We generated between 27 and 30 million reads of 75 bp that resulted in a total of 1181 contigs longer than 2 kb and an assembled genome size of 28.9 Mb (N50 = 48 kb, average depth = 125x). A total of 9052 proteins were annotated, and between 9531 and 17 266 SNPs were identified in the four pools. A subset of 206 genes (containing 574 SNPs, 11% false positives) was used to develop a genotyping platform for this species. Using this road map, we developed a genotyping assay with a total of 147 SNPs located in 121 genes using the Illumina® Sequenom iPLEX Gold. Our preliminary genotyping (success rate = 85%) of 304 individuals from 36 populations supports the utility of this approach for population genomics studies in other MPB fungal symbionts and other fungal nonmodel species.

7.
Massively parallel sequencing of a small proportion of the whole genome at high coverage makes it possible to answer a wide range of questions, from molecular evolution and evolutionary biology to animal and plant breeding and forensics. In this study, we describe the development of a restriction‐site associated DNA (RAD) sequencing approach for the Ion Torrent PGM platform. Our protocol results in extreme genome complexity reduction using two rare‐cutting restriction enzymes and strict size selection of the library, allowing sequencing of a relatively small number of genomic fragments with high sequencing depth. We applied this approach to a common freshwater fish species, the Eurasian perch (Perca fluviatilis L.), generated over 2.2 MB of novel sequence data consisting of ~17 000 contigs, and identified 1259 single nucleotide polymorphisms (SNPs). We also estimated genetic differentiation between the DNA pools from freshwater (Lake Peipus) and brackish water (the Baltic Sea) populations and identified SNPs with the strongest signal of differentiation that could be used for robust individual assignment in the future. This work represents an important step towards developing genomic resources and genetic tools for the Eurasian perch. We expect that our ddRAD sequencing protocol for semiconductor sequencing technology will be a useful alternative to currently available RAD protocols.

8.
The generation of genome‐scale data is critical for a wide range of questions in basic biology using model organisms, but also in questions of applied biology in nonmodel organisms (agriculture, natural resources, conservation and public health biology). Using a genome‐scale approach on a diverse group of nonmodel organisms and with the goal of lowering costs of the method, we modified a multiplexed, high‐throughput genomic scan technique utilizing two restriction enzymes. We analysed several pairs of restriction enzymes and completed double‐digestion RAD sequencing libraries for nine different species and five genera of insects and fish. We found that one particular enzyme pair produced a consistently higher number of sequence‐able fragments across all nine species. Building libraries off this enzyme pair, we found a range of usable SNPs between 4000 and 37 000 SNPs per species, and we found a greater number of usable SNPs using reference genomes than de novo pipelines in STACKS. We also found fewer reads in the Read 2 fragments from the paired‐end Illumina HiSeq run. Overall, the results of this study provide empirical evidence of the utility of this method for producing consistent data for diverse nonmodel species and suggest specific considerations for sequencing analysis strategies.

9.
10.
Research in evolutionary biology involving nonmodel organisms is rapidly shifting from using traditional molecular markers such as mtDNA and microsatellites to higher throughput SNP genotyping methodologies to address questions in population genetics, phylogenetics and genetic mapping. Restriction site associated DNA sequencing (RAD sequencing or RADseq) has become an established method for SNP genotyping on Illumina sequencing platforms. Here, we developed a protocol and adapters for double‐digest RAD sequencing for Ion Torrent (Life Technologies; Ion Proton, Ion PGM) semiconductor sequencing. We sequenced thirteen genomic libraries of three different nonmodel vertebrate species on Ion Proton with PI chips: Arctic charr Salvelinus alpinus, European whitefish Coregonus lavaretus and common lizard Zootoca vivipara. This resulted in ~962 million single‐end reads overall and a mean of ~74 million reads per library. We filtered the genomic data using Stacks, a bioinformatic tool to process RAD sequencing data. On average, we obtained ~11 000 polymorphic loci per library of 6–30 individuals. We validate our new method by technical and biological replication, by reconstructing phylogenetic relationships, and using a hybrid genetic cross to track genomic variants. Finally, we discuss the differences between using the different sequencing platforms in the context of RAD sequencing, assessing possible advantages and disadvantages. We show that our protocol can be used for Ion semiconductor sequencing platforms for the rapid and cost‐effective generation of variable and reproducible genetic markers.

11.
Next-generation sequencing has transformed the fields of ecological and evolutionary genetics by allowing for cost-effective identification of genome-wide variation. Single nucleotide polymorphism (SNP) arrays, or “SNP chips”, enable very large numbers of individuals to be consistently genotyped at a selected set of these identified markers, and also offer the advantage of being able to analyse samples of variable DNA quality. We used reduced representation restriction-aided digest sequencing (RAD-seq) of 31 birds of the threatened hihi (Notiomystis cincta; stitchbird) and low-coverage whole genome sequencing (WGS) of 10 of these birds to develop an Affymetrix 50 K SNP chip. We overcame the limitations of having no hihi reference genome and a low quantity of sequence data by separate and pooled de novo assembly of each of the 10 WGS birds. Reads from all individuals were mapped back to these de novo assemblies to identify SNPs. A subset of RAD-seq and WGS SNPs were selected for inclusion on the chip, prioritising SNPs with the highest quality scores whose flanking sequence uniquely aligned to the zebra finch (Taeniopygia guttata) genome. Of the 58,466 SNPs manufactured on the chip, 72% passed filtering metrics and were polymorphic. By genotyping 1,536 hihi on the array, we found that SNPs detected in multiple assemblies were more likely to successfully genotype, representing a cost-effective approach to identify SNPs for genotyping. Here, we demonstrate the utility of the SNP chip by describing the high rates of linkage disequilibrium in the hihi genome, reflecting the history of population bottlenecks in the species.

12.
High‐throughput DNA sequencing facilitates the analysis of large portions of the genome in nonmodel organisms, ensuring high accuracy of population genetic parameters. However, empirical studies evaluating the appropriate sample size for these kinds of studies are still scarce. In this study, we use double‐digest restriction‐associated DNA sequencing (ddRADseq) to recover thousands of single nucleotide polymorphisms (SNPs) for two physically isolated populations of Amphirrhox longifolia (Violaceae), a nonmodel plant species for which no reference genome is available. We used resampling techniques to construct simulated populations with a random subset of individuals and SNPs to determine how many individuals and biallelic markers should be sampled for accurate estimates of intra‐ and interpopulation genetic diversity. We identified 3646 and 4900 polymorphic SNPs for the two populations of A. longifolia, respectively. Our simulations show that, overall, a sample size greater than eight individuals has little impact on estimates of genetic diversity within A. longifolia populations when 1000 or more SNPs are used. Our results also show that even at a very small sample size (i.e. two individuals), accurate estimates of FST can be obtained with a large number of SNPs (≥1500). These results highlight the potential of high‐throughput genomic sequencing approaches to address questions related to evolutionary biology in nonmodel organisms. Furthermore, our findings also provide insights into the optimization of sampling strategies in the era of population genomics.

13.
14.
15.
Restriction‐site‐associated DNA sequencing (RAD‐seq) and related methods are revolutionizing the field of population genomics in nonmodel organisms as they allow generating an unprecedented number of single nucleotide polymorphisms (SNPs) even when no genomic information is available. Yet, RAD‐seq data analyses rely on assumptions about the nature and number of nucleotide variants present in a single locus, the choice of which may lead to an under‐ or overestimated number of SNPs and/or to incorrectly called genotypes. Using the Atlantic mackerel (Scomber scombrus L.) and a close relative, the Atlantic chub mackerel (Scomber colias), as a case study, here we explore the sensitivity of population structure inferences to two crucial aspects in RAD‐seq data analysis: the maximum number of mismatches allowed to merge reads into a locus and the relatedness of the individuals used for genotype calling and SNP selection. Our study resolves the population structure of the Atlantic mackerel, but, most importantly, provides insights into the effects of alternative RAD‐seq data analysis strategies on population structure inferences that are directly applicable to other species.

16.
Phylogenetic relationships among temperate species of bamboo are difficult to resolve, owing to both the challenge of detecting sufficiently variable markers and their polyploid history. Here, we use restriction site–associated DNA sequencing to identify candidate loci with fixed allelic differences segregating between and within two temperate species of bamboos: Arundinaria faberi and Yushania brevipaniculata. Approximately 27 million paired‐end sequencing reads were generated across four samples. From pooled data, we assembled 67 685 and 70 668 de novo contigs from partial overlap among paired‐end reads, with an average length of 240 and 241 bp for the two species, respectively, which were used to investigate functional classification of RAD tags in a blastx search. Analysed separately by population, we recovered 29 443 putatively orthologous RAD tags shared across the four sampled populations, containing 28 023 sequence variants, of which c. 13 000 are segregating between species, and c. 3000 segregating between populations within each species. Analyses based on these RAD tags yielded robust phylogenetic inferences, even with data sets constructed from surprisingly few loci. This study illustrates the potential for reduced‐representation genome data to resolve difficult phylogenetic relationships in temperate bamboos.

17.
Restriction site‐associated DNA sequencing (RAD‐Seq), a next‐generation sequencing‐based genome ‘complexity reduction’ protocol, has been useful in population genomics in species with a reference genome. However, the application of this protocol to natural populations of genomically underinvestigated species, particularly under low‐to‐medium sequencing depth, has not been well justified. In this study, a Bayesian method was developed for calling genotypes from an F2 population of bottle gourd [Lagenaria siceraria (Mol.) Standl.] to construct a high‐density genetic map. Low‐depth genome shotgun sequencing allowed the assembly of scaffolds/contigs comprising approximately 50% of the estimated genome, of which 922 were anchored for identifying syntenic regions between species. RAD‐Seq genotyping of a natural population comprising 80 accessions identified 3226 single nucleotide polymorphisms (SNPs), based on which two sub‐gene pools were suggested for association with fruit shape. The two sub‐gene pools were moderately differentiated, as reflected by the Hudson's FST value of 0.14, and they represent regions on LG7 with strikingly elevated FST values. A seven‐fold reduction in heterozygosity and a two‐fold increase in LD (r²) were observed in the same region for the round‐fruited sub‐gene pool. An outlier test suggested the locus LX3405 on LG7 to be a candidate site under selection. Comparative genomic analysis revealed that the cucumber genome region syntenic to the high FST island on LG7 harbors an ortholog of the tomato fruit shape gene OVATE. Our results point to a bright future of applying RAD‐Seq to population genomic studies for non‐model species even under low‐to‐medium sequencing efforts. The genomic resources provide valuable information for cucurbit genome research.
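The abstract reports a Hudson's FST of 0.14 without saying how it was computed. As an illustration only, the sketch below uses one common formulation of Hudson's estimator (the form given by Bhatia et al., 2013), combined across SNPs as a ratio of sums; whether the study used this exact form is an assumption. The function name, the allele frequencies and the sample sizes are all hypothetical.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson's FST across SNPs, computed as a ratio of sums.

    p1, p2: per-SNP sample allele frequencies in populations 1 and 2
    n1, n2: number of sampled allele copies (chromosomes) per population
    Per SNP:
      numerator   = (p1 - p2)^2 - p1(1 - p1)/(n1 - 1) - p2(1 - p2)/(n2 - 1)
      denominator = p1(1 - p2) + p2(1 - p1)
    """
    if len(p1) != len(p2):
        raise ValueError("frequency lists must have the same length")
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        denominator += a * (1 - b) + b * (1 - a)
    if denominator == 0:
        raise ValueError("no variable sites between the populations")
    return numerator / denominator


# Hypothetical allele frequencies at three SNPs, 40 allele copies per pool.
pool_a = [0.10, 0.55, 0.80]
pool_b = [0.30, 0.50, 0.40]
print(hudson_fst(pool_a, pool_b, 40, 40))
```

Summing numerators and denominators before dividing, rather than averaging per-SNP ratios, keeps rare variants from dominating the estimate; a fixed difference between the pools gives a value of 1.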

18.
Systematic sequencing is the method of choice for generating genomic resources for molecular marker development and candidate gene identification in nonmodel species. We generated 47 357 Sanger ESTs and 2.2M Roche‐454 reads from five cDNA libraries for European beech (Fagus sylvatica L.). This tree species of high ecological and economic value in Europe is among the most representative trees of deciduous broadleaf forests. The sequences generated were assembled into 21 057 contigs with MIRA software. Functional annotations were obtained for 85% of these contigs, from the proteomes of four plant species, Swissprot accessions and the Gene Ontology database. We were able to identify 28 079 in silico SNPs for future marker development. Moreover, RNAseq and qPCR approaches identified genes and gene networks regulated differentially between two critical phenological stages preceding vegetative bud burst (the quiescent and swelling buds stages). According to climatic model‐based projection, some European beech populations may be endangered, particularly at the southern and eastern edges of the European distribution range, which are strongly affected by current climate change. This first genomic resource for the genus Fagus should facilitate the identification of key genes for beech adaptation and management strategies for preserving beech adaptability.

19.
High‐throughput sequencing has revolutionized population and conservation genetics. RAD sequencing methods, such as 2b‐RAD, can be used on species lacking a reference genome. However, transferring protocols across taxa can potentially lead to poor results. We tested two different IIB enzymes (AlfI and CspCI) on two species with different genome sizes (the loggerhead turtle Caretta caretta and the sharpsnout seabream Diplodus puntazzo) to build a set of guidelines to improve 2b‐RAD protocols on non‐model organisms while optimising costs. Good results were obtained even with degraded samples, showing the value of 2b‐RAD in studies with poor DNA quality. However, library quality was found to be a critical parameter on the number of reads and loci obtained for genotyping. Resampling analyses with different numbers of reads per individual showed a trade‐off between number of loci and number of reads per sample. The resulting accumulation curves can be used as a tool to calculate the number of sequences per individual needed to reach a mean depth ≥20 reads to acquire good genotyping results. Finally, we demonstrated that selective‐base ligation does not affect genomic differentiation between individuals, indicating that this technique can be used in species with large genome sizes to adjust the number of loci to the study scope, to reduce sequencing costs and to maintain suitable sequencing depth for a reliable genotyping without compromising the results. Here, we provide a set of guidelines to improve 2b‐RAD protocols on non‐model organisms with different genome sizes, helping decision‐making for a reliable and cost‐effective genotyping.

20.
Genome scans have made it possible to find outlier markers thought to have been influenced by divergent selection in almost any wild population. However, the lack of genomic information in nonmodel species often makes it difficult to associate these markers with certain genes or chromosome regions. Furthermore, the extent of linkage disequilibrium (LD) in the genome will determine the density of markers required to identify the genes under selection. In this study, we investigated a chromosome region in the willow warbler Phylloscopus trochilus surrounding a single marker previously identified in a genome scan. We first located the marker in the assembled genome of another species, the zebra finch Taeniopygia guttata, and amplified surrounding sequences in Fennoscandian willow warblers. Within an investigated chromosome region of 7.3 Mb as mapped to the zebra finch genome, we observed elevated genetic differentiation between a southern and a northern population across a 2.5-Mb interval comprising numerous coding genes. Within the southern and northern populations, higher values of LD were mostly found between SNPs within the same locus, but extended across distantly situated loci when the analyses were restricted to sampling sites showing intermediate allele frequencies of southern and northern alleles. Our study shows that cross-species genome information is a useful resource to obtain candidate sequences adjacent to outlier markers in nonmodel species.

