20 similar articles found (search time: 15 ms)
1.
Genetic sex assignment in wild populations using genotyping‐by‐sequencing data: A statistical threshold approach
William R. Stovall Helen R. Taylor Michael Black Stefanie Grosser Kim Rutherford Neil J. Gemmell 《Molecular ecology resources》2018,18(2):179-190
Establishing the sex of individuals in wild systems can be challenging and often requires genetic testing. Genotyping‐by‐sequencing (GBS) and other reduced‐representation DNA sequencing (RRS) protocols (e.g., RADseq, ddRAD) have enabled the analysis of genetic data on an unprecedented scale. Here, we present a novel approach for the discovery and statistical validation of sex‐specific loci in GBS data sets. We used GBS to genotype 166 New Zealand fur seals (NZFS, Arctocephalus forsteri) of known sex. We retained monomorphic loci as potential sex‐specific markers in the locus discovery phase. We then used (i) a sex‐specific locus threshold (SSLT) to identify significantly male‐specific loci within our data set; and (ii) a significant sex‐assignment threshold (SSAT) to confidently assign sex in silico, based on the presence or absence of significantly male‐specific loci, to individuals in our data set treated as unknowns (98.9% accuracy for females; 95.8% for males, estimated via cross‐validation). Furthermore, we assigned sex to 86 individuals of true unknown sex using our SSAT and assessed the effect of SSLT adjustments on these assignments. From 90 verified sex‐specific loci, we developed a panel of three sex‐specific PCR primers that we used to ascertain sex independently of our GBS data, which we show amplify reliably in at least two other pinniped species. Using monomorphic loci normally discarded from large SNP data sets is an effective way to identify robust sex‐linked markers for nonmodel species. Our novel pipeline can be used to identify and statistically validate monomorphic and polymorphic sex‐specific markers across a range of species and RRS data sets.
2.
Wubishet A. Bekele Charlene P. Wight Shiaoman Chao Catherine J. Howarth Nicholas A. Tinker 《Plant biotechnology journal》2018,16(8):1452-1463
In a de novo genotyping‐by‐sequencing (GBS) analysis of short, 64‐base tag‐level haplotypes in 4657 accessions of cultivated oat, we discovered 164741 tag‐level (TL) genetic variants containing 241224 SNPs. From this, the marker density of an oat consensus map was increased by the addition of more than 70000 loci. The mapped TL genotypes of a 635‐line diversity panel were used to infer chromosome‐level (CL) haplotype maps. These maps revealed differences in the number and size of haplotype blocks, as well as differences in haplotype diversity between chromosomes and subsets of the diversity panel. We then explored potential benefits of SNP vs. TL vs. CL GBS variants for mapping, high‐resolution genome analysis and genomic selection in oats. A combined genome‐wide association study (GWAS) of heading date from multiple locations using both TL haplotypes and individual SNP markers identified 184 significant associations. A comparative GWAS using TL haplotypes, CL haplotype blocks and their combinations demonstrated the superiority of using TL haplotype markers. Using a principal component‐based genome‐wide scan, genomic regions containing signatures of selection were identified. These regions may contain genes that are responsible for the local adaptation of oats to Northern American conditions. Genomic selection for heading date using TL haplotypes or SNP markers gave comparable and promising prediction accuracies of up to r = 0.74. Genomic selection carried out in an independent calibration and test population for heading date gave promising prediction accuracies that ranged between r = 0.42 and 0.67. In conclusion, TL haplotype GBS‐derived markers facilitate genome analysis and genomic selection in oat.
3.
Population and phylogenomic decomposition via genotyping‐by‐sequencing in Australian Pelargonium
Adrienne B. Nicotra Caroline Chong Jason G. Bragg Chong Ren Ong Nicola C. Aitken Aaron Chuah Brendan Lepschi Justin O. Borevitz 《Molecular ecology》2016,25(9):2000-2014
Species delimitation has seen a paradigm shift as increasing accessibility of genomic‐scale data enables separation of lineages with convergent morphological traits and the merging of recently diverged ecotypes that have distinguishing characteristics. We inferred the process of lineage formation among Australian species in the widespread and highly variable genus Pelargonium by combining phylogenomic and population genomic analyses along with breeding system studies and character analysis. Phylogenomic analysis and population genetic clustering supported seven of the eight currently described species but provided little evidence for differences in genetic structure within the most widely distributed group, the one containing P. australe. In contrast, morphometric analysis detected three deep lineages within Australian Pelargonium, with P. australe consisting of five previously unrecognized entities occupying separate geographic ranges. The genomic approach enabled elucidation of parallel evolution in some traits formerly used to delineate species, as well as identification of ecotypic morphological differentiation within recognized species. Highly variable morphology and trait convergence each contribute to the discordance between phylogenomic relationships and morphological taxonomy. Data suggest that genetic divergence among species within the Australian Pelargonium may result from allopatric speciation while morphological differentiation within and among species may be more strongly driven by environmental differences.
4.
Shifa Xiong Yunxiao Zhao Yicun Chen Ming Gao Liwen Wu Yangdong Wang 《Ecology and evolution》2020,10(16):8949-8958
Analysis of genetic diversity and population structure among Quercus fabri populations is essential for the conservation and utilization of Q. fabri resources. Here, the genetic diversity and structure of 158 individuals from 13 natural populations of Quercus fabri in China were analyzed using genotyping‐by‐sequencing (GBS). A total of 459,564 high‐quality single nucleotide polymorphisms (SNPs) were obtained after filtration for subsequent analysis. Genetic structure analysis revealed that these individuals can be clustered into two groups and that the structure can be explained mainly by geographic barriers. It also showed gene introgression from coastal to inland areas and indicated that high mountains could significantly hinder the mutual introgression of genes. Genetic diversity analysis indicated that the individual differences within groups are greater than the differences between the two groups. These results will help us better understand the genetic backgrounds of Q. fabri.
5.
Optimization of the genotyping‐by‐sequencing strategy for population genomic analysis in conifers
Jin Pan Baosheng Wang Zhi‐Yong Pei Wei Zhao Jie Gao Jian‐Feng Mao Xiao‐Ru Wang 《Molecular ecology resources》2015,15(4):711-722
Flexibility and low cost make genotyping‐by‐sequencing (GBS) an ideal tool for population genomic studies of nonmodel species. However, to utilize the potential of the method fully, many parameters affecting library quality and single nucleotide polymorphism (SNP) discovery require optimization, especially for conifer genomes with a high repetitive DNA content. In this study, we explored strategies for effective GBS analysis in pine species. We constructed GBS libraries using HpaII, PstI and EcoRI‐MseI digestions with different multiplexing levels and examined the effect of restriction enzymes on library complexity and the impact of sequencing depth and size selection of restriction fragments on sequence coverage bias. We tested and compared UNEAK, Stacks and GATK pipelines for the GBS data, and then developed a reference‐free SNP calling strategy for haploid pine genomes. Our GBS procedure proved to be effective in SNP discovery, producing 7000–11 000 and 14 751 SNPs within and among three pine species, respectively, from a PstI library. This investigation provides guidance for the design and analysis of GBS experiments, particularly for organisms for which genomic information is lacking.
6.
Paralogs are revealed by proportion of heterozygotes and deviations in read ratios in genotyping‐by‐sequencing data from natural populations
Garrett J. McKinney Ryan K. Waples Lisa W. Seeb James E. Seeb 《Molecular ecology resources》2017,17(4):656-669
Whole‐genome duplications have occurred in the recent ancestors of many plants, fish, and amphibians, resulting in a pervasiveness of paralogous loci and the potential for both disomic and tetrasomic inheritance in the same genome. Paralogs can be difficult to reliably genotype and are often excluded from genotyping‐by‐sequencing (GBS) analyses; however, removal requires paralogs to be identified which is difficult without a reference genome. We present a method for identifying paralogs in natural populations by combining two properties of duplicated loci: (i) the expected frequency of heterozygotes exceeds that for singleton loci, and (ii) within heterozygotes, observed read ratios for each allele in GBS data will deviate from the 1:1 expected for singleton (diploid) loci. These deviations are often not apparent within individuals, particularly when sequence coverage is low; but, we postulated that summing allele reads for each locus over all heterozygous individuals in a population would provide sufficient power to detect deviations at those loci. We identified paralogous loci in three species: Chinook salmon (Oncorhynchus tshawytscha) which retains regions with ongoing residual tetrasomy on eight chromosome arms following a recent whole‐genome duplication, mountain barberry (Berberis alpina) which has a large proportion of paralogs that arose through an unknown mechanism, and dusky parrotfish (Scarus niger) which has largely rediploidized following an ancient whole‐genome duplication. Importantly, this approach only requires the genotype and allele‐specific read counts for each individual, information which is readily obtained from most GBS analysis pipelines.
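The two properties described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the input layout (one genotype string and two read counts per individual) and the cutoff values `max_het` and `max_abs_z` are assumptions made for illustration. For one locus it computes the proportion of heterozygotes and a z-score for the deviation of the summed allele reads from the 1:1 ratio, using a normal approximation to the binomial.

```python
import math

def locus_summary(genotypes, reads_a, reads_b):
    """Return (heterozygote proportion, read-ratio z-score) for one locus.

    genotypes: list of 'AA', 'AB' or 'BB', one per individual (assumed layout).
    reads_a, reads_b: per-individual read counts for allele A and allele B.
    """
    n = len(genotypes)
    het_idx = [i for i, g in enumerate(genotypes) if g == "AB"]
    het_prop = len(het_idx) / n
    # Sum allele reads over all heterozygous individuals, as the abstract describes.
    a = sum(reads_a[i] for i in het_idx)
    total = a + sum(reads_b[i] for i in het_idx)
    if total == 0:
        return het_prop, 0.0
    # Under a 1:1 ratio, a ~ Binomial(total, 0.5): mean total/2, variance total/4.
    z = (a - 0.5 * total) / math.sqrt(total * 0.25)
    return het_prop, z

def looks_paralogous(het_prop, z, max_het=0.6, max_abs_z=7.0):
    """Flag a locus using hypothetical cutoffs; tune them on real data."""
    return het_prop > max_het or abs(z) > max_abs_z

h, z = locus_summary(["AB", "AB", "AA", "BB"], [30, 30, 20, 0], [10, 10, 0, 20])
print(h, round(z, 3), looks_paralogous(h, z))
```

Summing reads across heterozygotes before testing is the point of the design: a 3:1 skew that is invisible at 8 reads in one individual becomes a large z-score once hundreds of reads are pooled.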
7.
GIbPSs: a toolkit for fast and accurate analyses of genotyping‐by‐sequencing data without a reference genome
Genotyping‐by‐sequencing (GBS) and related methods are increasingly used for studies of non‐model organisms from population genetic to phylogenetic scales. We present GIbPSs, a new genotyping toolkit for the analysis of data from various protocols such as RAD, double‐digest RAD, GBS, and two‐enzyme GBS without a reference genome. GIbPSs can handle paired‐end GBS data and is able to assign reads from both strands of a restriction fragment to the same locus. GIbPSs is most suitable for population genetic and phylogeographic analyses. It avoids genotyping errors due to indel variation by identifying and discarding affected loci. GIbPSs creates a genotype database that offers rich functionality for data filtering and export in numerous formats. We performed comparative analyses of simulated and real GBS data with GIbPSs and another program, pyRAD. This program accounts for indel variation by aligning homologous sequences. GIbPSs performed better than pyRAD in several aspects. It required much less computation time and displayed higher genotyping accuracy. GIbPSs retained smaller numbers of loci overall in analyses of real GBS data. It nevertheless delivered more complete genotype matrices with greater locus overlap between individuals and greater numbers of loci sampled in all individuals.
8.
Resolving allele dosage in duplicated loci using genotyping‐by‐sequencing data: A path forward for population genetic analysis
Garrett J. McKinney Ryan K. Waples Carita E. Pascal Lisa W. Seeb James E. Seeb 《Molecular ecology resources》2018,18(3):570-579
Whole‐genome duplications have occurred in the recent ancestors of many plants, fish and amphibians. Signals of these whole‐genome duplications still exist in the form of paralogous loci. Recent advances have allowed reliable identification of paralogs in genotyping‐by‐sequencing (GBS) data such as that generated from restriction‐site‐associated DNA sequencing (RADSeq); however, excluding paralogs from analyses is still routine due to difficulties in genotyping. This exclusion of paralogs may filter a large fraction of loci, including loci that may be adaptively important or informative for population genetic analyses. We present a maximum‐likelihood method for inferring allele dosage in paralogs and assess its accuracy using simulated GBS, empirical RADSeq and amplicon sequencing data from Chinook salmon. We accurately infer allele dosage for some paralogs from a RADSeq data set and show how accuracy is dependent upon both read depth and allele frequency. The amplicon sequencing data set, using RADSeq‐derived markers, achieved sufficient depth to infer allele dosage for all paralogs. This study demonstrates that RADSeq locus discovery combined with amplicon sequencing of targeted loci is an effective method for incorporating paralogs into population genetic analyses.
9.
10.
Anna Barbanti Hector Torrado Enrique Macpherson Luca Bargelloni Rafaella Franch Carlos Carreras Marta Pascual 《Molecular ecology resources》2020,20(3):795-806
High‐throughput sequencing has revolutionized population and conservation genetics. RAD sequencing methods, such as 2b‐RAD, can be used on species lacking a reference genome. However, transferring protocols across taxa can potentially lead to poor results. We tested two different IIB enzymes (AlfI and CspCI) on two species with different genome sizes (the loggerhead turtle Caretta caretta and the sharpsnout seabream Diplodus puntazzo) to build a set of guidelines to improve 2b‐RAD protocols on non‐model organisms while optimising costs. Good results were obtained even with degraded samples, showing the value of 2b‐RAD in studies with poor DNA quality. However, library quality was found to be a critical factor for the number of reads and loci obtained for genotyping. Resampling analyses with different numbers of reads per individual showed a trade‐off between number of loci and number of reads per sample. The resulting accumulation curves can be used as a tool to calculate the number of sequences per individual needed to reach a mean depth ≥20 reads to acquire good genotyping results. Finally, we demonstrated that selective‐base ligation does not affect genomic differentiation between individuals, indicating that this technique can be used in species with large genome sizes to adjust the number of loci to the study scope, to reduce sequencing costs and to maintain suitable sequencing depth for reliable genotyping without compromising the results. Here, we provide a set of guidelines to improve 2b‐RAD protocols on non‐model organisms with different genome sizes, helping decision‐making for reliable and cost‐effective genotyping.
11.
Zhengxiao Zhai Wenjing Zhao Chuan He Kaixuan Yang Linlin Tang Shuyun Liu Yan Zhang Qizhong Huang He Meng 《Animal genetics》2015,46(2):216-219
Single nucleotide polymorphisms (SNPs) are essential to the understanding of population genetic variation and diversity. Here, we performed restriction‐site‐associated DNA sequencing (RAD‐seq) on 72 individuals from 13 Chinese indigenous and three introduced chicken breeds. A total of 620 million reads were obtained using an Illumina HiSeq 2000 sequencer. An average of 75 587 SNPs were identified from each individual. Further strict filtering validated 28 895 SNP candidates for all populations. When compared with the NCBI dbSNP (chicken_9031), 15 404 SNPs were new discoveries. In this study, RAD‐seq was performed for the first time on chickens, indicating its remarkable effectiveness and potential applications in genetic analysis and breeding techniques for whole‐genome selection in chicken and other agricultural animals.
12.
Genotyping‐by‐sequencing approaches to characterize crop genomes: choosing the right tool for the right application
In the last decade, the revolution in sequencing technologies has deeply impacted crop genotyping practice. New methods allowing rapid, high‐throughput genotyping of entire crop populations have proliferated and opened the door to wider use of molecular tools in plant breeding. These new genotyping‐by‐sequencing (GBS) methods include over a dozen reduced‐representation sequencing (RRS) approaches and at least four whole‐genome resequencing (WGR) approaches. The diversity of methods available, each often producing different types of data at different cost, can make selection of the best‐suited method seem a daunting task. We review the most common genotyping methods used today and compare their suitability for linkage mapping, genomewide association studies (GWAS), marker‐assisted and genomic selection and genome assembly and improvement in crops with various genome sizes and complexity. Furthermore, we give an outline of bioinformatics tools for analysis of genotyping data. WGR is well suited to genotyping biparental cross populations with complex, small‐ to moderate‐sized genomes and provides the lowest cost per marker data point. RRS approaches differ in their suitability for various tasks, but demonstrate similar costs per marker data point. These approaches are generally better suited for de novo applications and more cost‐effective when genotyping populations with large genomes or high heterozygosity. We expect that although RRS approaches will remain the most cost‐effective for some time, WGR will become more widespread for crop genotyping as sequencing costs continue to decrease.
13.
Ploidy levels sometimes vary among individuals or populations, particularly in plants. When such variation exists, accurate determination of cytotype can inform studies of ecology or trait variation and is required for population genetic analyses. Here, we propose and evaluate a statistical approach for distinguishing low‐level ploidy variants (e.g. diploids, triploids and tetraploids) based on genotyping‐by‐sequencing (GBS) data. The method infers cytotypes based on observed heterozygosity and the ratio of DNA sequences containing different alleles at thousands of heterozygous SNPs (i.e. allelic ratios). Whereas the method does not require prior information on ploidy, a reference set of samples with known ploidy can be included in the analysis if it is available. We explore the power and limitations of this method using simulated data sets and GBS data from natural populations of aspen (Populus tremuloides) known to include both diploid and triploid individuals. The proposed method was able to reliably discriminate among diploids, triploids and tetraploids in simulated data sets, and this was true for different levels of genetic diversity, inbreeding and population structure. Power and accuracy were minimally affected by low coverage (i.e. 2×), but did sometimes suffer when simulated mixtures of diploids, autotetraploids and allotetraploids were analysed. Cytotype assignments based on the proposed method closely matched those from previous microsatellite and flow cytometry data when applied to GBS data from aspen. An R package (gbs2ploidy) implementing the proposed method is available from CRAN.
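The idea that allelic ratios at heterozygous SNPs carry ploidy information can be sketched in a few lines of Python. This is a simplified illustration, not the gbs2ploidy algorithm: it assumes each heterozygous site is drawn with equal weight from the allelic ratios expected for a given ploidy (1/2 for diploids; 1/3 and 2/3 for triploids; 1/4, 1/2 and 3/4 for tetraploids), scores each ploidy by a binomial mixture likelihood, and ignores observed heterozygosity. All names are hypothetical.

```python
import math

# Expected allelic ratios at heterozygous sites for each ploidy level.
EXPECTED_RATIOS = {2: [1 / 2], 3: [1 / 3, 2 / 3], 4: [1 / 4, 1 / 2, 3 / 4]}

def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def ploidy_log_likelihood(counts, ploidy):
    """Sum over sites of the log mixture likelihood for one ploidy level.

    counts: list of (reads_for_one_allele, total_reads) per heterozygous SNP.
    """
    ratios = EXPECTED_RATIOS[ploidy]
    total = 0.0
    for k, n in counts:
        site = sum(math.exp(log_binom_pmf(k, n, p)) for p in ratios) / len(ratios)
        total += math.log(max(site, 1e-300))  # floor guards against log(0)
    return total

def best_ploidy(counts):
    """Return the ploidy level with the highest log-likelihood."""
    return max(EXPECTED_RATIOS, key=lambda pl: ploidy_log_likelihood(counts, pl))

sites = [(10, 30), (20, 30), (10, 30), (20, 30)]  # ratios near 1/3 and 2/3
print(best_ploidy(sites))
```

With only a handful of sites and shallow depth, the likelihoods for triploids and tetraploids can be close, which is one reason methods of this kind pool thousands of SNPs per individual.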
14.
Neele Wendler Martin Mascher Christiane Nöh Axel Himmelbach Uwe Scholz Brigitte Ruge‐Wehling Nils Stein 《Plant biotechnology journal》2014,12(8):1122-1131
Crop wild relatives (CWR) provide an important source of allelic diversity for any given crop plant species for counteracting the erosion of genetic diversity caused by domestication and elite breeding bottlenecks. Hordeum bulbosum L. represents the secondary gene pool of the genus Hordeum. It has been used as a source of genetic introgressions for improving elite barley germplasm (Hordeum vulgare L.). However, genetic introgressions from H. bulbosum have not yet been broadly applied, due to a lack of suitable molecular tools for locating and characterizing introgressed segments and for decreasing their size by recombination and marker‐assisted backcrossing. We applied next‐generation sequencing (NGS) based strategies for unlocking genetic diversity of three diploid introgression lines of cultivated barley containing chromosomal segments of its close relative H. bulbosum. Firstly, exome capture‐based (re)‐sequencing revealed large numbers of single nucleotide polymorphisms (SNPs) enabling the precise allocation of H. bulbosum introgressions. This SNP resource was further exploited by designing a custom multiplex SNP genotyping assay. Secondly, two‐enzyme‐based genotyping‐by‐sequencing (GBS) was employed to allocate the introgressed H. bulbosum segments and to genotype a mapping population. Both methods provided fast and reliable detection and mapping of the introgressed segments and enabled the identification of recombinant plants. Thus, the utilization of H. bulbosum as a resource of natural genetic diversity in barley crop improvement will be greatly facilitated by these tools in the future.
15.
Characterization of MHC class II B polymorphism in multiple populations of wild gorillas using non‐invasive samples and next‐generation sequencing
Jörg B. Hans Anne Haubner Mimi Arandjelovic Richard A. Bergl Tillmann Fünfstück Maryke Gray David B. Morgan Martha M. Robbins Crickette Sanz Linda Vigilant 《American journal of primatology》2015,77(11):1193-1206
16.
Crawford Drury Rocío Pérez Portela Xaymara M. Serrano Marjorie Oleksiak Andrew C. Baker 《Ecology and evolution》2020,10(12):6009-6019
Mesophotic reefs (30‐150 m) have been proposed as potential refugia that facilitate the recovery of degraded shallow reefs following acute disturbances such as coral bleaching and disease. However, because of the technical difficulty of collecting samples, the connectivity of adjacent mesophotic reefs is relatively unknown compared with shallower counterparts. We used genotyping by sequencing to assess fine‐scale genetic structure of Montastraea cavernosa at two sites at Pulley Ridge, a mesophotic coral reef ecosystem in the Gulf of Mexico, and downstream sites along the Florida Reef Tract. We found differentiation between reefs at Pulley Ridge (~68 m) and corals at downstream upper mesophotic depths in the Dry Tortugas (28–36 m) and shallow reefs in the northern Florida Keys (Key Biscayne, ~5 m). The spatial endpoints of our study were distinct, with the Dry Tortugas as a genetic intermediate. Most striking were differences in population structure among northern and southern sites at Pulley Ridge that were separated by just 12 km. Unique patterns of clonality and outlier loci allele frequency support these sites as different populations and suggest that the long‐distance horizontal connectivity typical of shallow‐water corals may not be typical for mesophotic systems in Florida and the Gulf of Mexico. We hypothesize that this may be due to the spawning of buoyant gametes, which commits propagules to the surface, resulting in greater dispersal and lower connectivity than typically found between nearby shallow sites. Differences in population structure over small spatial scales suggest that demographic constraints and/or environmental disturbances may be more variable in space and time on mesophotic reefs compared with their shallow‐water counterparts.
17.
Farah Bendaoud Gunjune Kim Hailey Larose James H. Westwood Nadjia Zermane David C. Haak 《Ecology and evolution》2022,12(3)
Crenate broomrape (Orobanche crenata Forsk.) is a serious long‐standing parasitic weed problem in Algeria, mainly affecting legumes but also vegetable crops. Unresolved questions for parasitic weeds revolve around the extent to which these plants undergo local adaptation, especially with respect to host specialization, which would be expected to be a strong selective factor for obligate parasitic plants. In the present study, the genotyping‐by‐sequencing (GBS) approach was used to analyze genetic diversity and population structure of 10 Northern Algerian O. crenata populations with different geographical origins and host species (faba bean, pea, chickpea, carrot, and tomato). In total, 8004 high‐quality single‐nucleotide polymorphisms (5% missingness) were obtained and used across the study. Genetic diversity and relationships of 95 individuals from 10 populations were studied using model‐based ancestry analysis, principal components analysis, discriminant analysis of principal components, and phylogeny approaches. The genetic differentiation (FST) between pairs of populations was lower between adjacent populations and higher between geographically separated ones, but no support was found for isolation by distance. Further analyses identified four genetic clusters and revealed evidence of structuring among populations and, although confounded with location, among hosts. In the clearest example, O. crenata growing on pea had a SNP profile that was distinct from other host/location combinations. These results illustrate the importance and potential of GBS to reveal the dynamics of parasitic weed dispersal and population structure.
18.
19.
Heritability estimates from genomewide relatedness matrices in wild populations: Application to a passerine, using a small sample size
Genomic developments have empowered the investigation of heritability in wild populations directly from genomewide relatedness matrices (GRM). Such GRM‐based approaches can in particular be used to improve or substitute approaches based on social pedigree (PED‐social). However, measuring heritability from GRM in the wild has not been widely applied yet, especially using small samples and in nonmodel species. Here, we estimated heritability for four quantitative traits (tarsus length, wing length, bill length and body mass), using PED‐social, a pedigree corrected by genetic data (PED‐corrected) and a GRM from a small sample (n = 494) of blue tits from natural populations in Corsica genotyped at nearly 50,000 filtered SNPs derived from RAD‐seq. We also measured genetic correlations among traits, and we performed chromosome partitioning. Heritability estimates were slightly higher when using GRM compared to PED‐social, and PED‐corrected yielded intermediate values, suggesting a minor underestimation of heritability in PED‐social due to incorrect pedigree links, including extra‐pair paternity, and to lower information content than the GRM. Genetic correlations among traits were similar between PED‐social and GRM but credible intervals were very large in both cases, suggesting a lack of power for this small data set. Although a positive linear relationship was found between the number of genes per chromosome and the chromosome heritability for tarsus length, chromosome partitioning similarly showed a lack of power for the three other traits. We discuss the usefulness and limitations of the quantitative genetic inferences based on genomic data in small samples from wild populations.
20.
X.‐L. Wu J. Xu H. Li R. Ferretti J. He J. Qiu Q. Xiao B. Simpson T. Michell S. D. Kachman R. G. Tait S. Bauck 《Animal genetics》2019,50(4):367-371
SNP arrays are widely used in genetic research and agricultural genomics applications, and the quality of SNP genotyping data is of paramount importance. In the present study, SNP genotyping concordance and discordance were evaluated for commercial bovine SNP arrays based on two types of quality assurance (QA) samples provided by Neogen GeneSeek. The genotyping discordance rates (GDRs) between chips were on average between 0.06% and 0.37% based on the QA type I data and between 0.05% and 0.15% based on the QA type II data. The average genotyping error rate (GER) pertaining to single SNP chips, based on the QA type II data, varied between 0.02% and 0.08% per SNP and between 0.01% and 0.06% per sample. These results indicate that the genotyping concordance rate was high (i.e. from 99.63% to 99.99%). Nevertheless, mitochondrial and Y chromosome SNPs had considerably elevated GDRs and GERs compared to the SNPs on the 29 autosomes and X chromosome. The majority of genotyping errors resulted from single allotyping errors, which also included the opposite instances for allele ‘dropout’ (i.e. from AB to AA or BB). Simultaneous allotyping errors on both alleles (e.g. mistaking AA for BB or vice versa) were relatively rare. Finally, a list of SNPs with a GER greater than 1% is provided. Association effects of these SNPs, for example in genome‐wide association studies, should be interpreted with caution. The genotyping concordance information needs to be considered in the optimal design of future bovine SNP arrays.
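The concordance figures in this abstract follow from the discordance rates by subtraction: the lower bound of 99.63% corresponds to the highest average discordance of 0.37%. A short Python sketch shows that arithmetic together with one plausible way to compute a discordance rate between two sets of calls for the same sample. The function names, the two-letter genotype encoding and the `"--"` missing-call code are assumptions for illustration, not the study's pipeline.

```python
def discordance_rate(calls_a, calls_b, missing="--"):
    """Percentage of SNPs whose genotype calls differ between two runs.

    calls_a, calls_b: equal-length lists of genotype strings such as 'AA' or 'AB'.
    SNPs with a missing call in either run are skipped (an assumed convention).
    """
    compared = 0
    mismatched = 0
    for a, b in zip(calls_a, calls_b):
        if a == missing or b == missing:
            continue
        compared += 1
        # Sort alleles so that 'AB' and 'BA' count as the same genotype.
        if sorted(a) != sorted(b):
            mismatched += 1
    if compared == 0:
        raise ValueError("no SNPs with calls in both runs")
    return 100.0 * mismatched / compared

def concordance_from_discordance(discordance_pct):
    """Concordance is the complement of discordance, both in percent."""
    return 100.0 - discordance_pct

chip_1 = ["AA", "AB", "BB", "--", "BA"]
chip_2 = ["AA", "AA", "BB", "AB", "AB"]
print(discordance_rate(chip_1, chip_2))               # one mismatch in four compared SNPs
print(round(concordance_from_discordance(0.37), 2))   # 99.63
```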