Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
In genome‐wide association studies, quality control (QC) of genotypes is important to avoid spurious results. It is also important to maintain long‐term data integrity, particularly in settings with ongoing genotyping (e.g. estimation of genomic breeding values). Here we discuss snpqc, a fully automated pipeline to perform QC analyses of Illumina SNP array data. It applies a wide range of common quality metrics with user‐defined filtering thresholds to generate a comprehensive QC report and a filtered dataset, including a genomic relationship matrix, ready for further downstream analyses, which makes it amenable to integration in high‐throughput environments. snpqc also builds a database to store genotypic, phenotypic and quality metrics to ensure data integrity and the option of integrating more samples from subsequent runs. The program is generic across species and array designs, providing a convenient interface between the genotyping laboratory and downstream genome‐wide association study or genomic prediction.
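
As a rough illustration of the threshold-based filtering described above, the following Python sketch computes two common QC metrics, per-SNP call rate and minor allele frequency, and keeps the SNPs that pass user-defined thresholds. It is not snpqc's code: the 0/1/2 genotype coding, the function names, and the default thresholds are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of allele B; None marks a missing call.
Genotype = Optional[int]


def call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype call."""
    if not calls:
        return 0.0
    return sum(g is not None for g in calls) / len(calls)


def minor_allele_frequency(calls: Sequence[Genotype]) -> float:
    """Minor allele frequency computed from non-missing calls only."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    freq_b = sum(observed) / (2 * len(observed))
    return min(freq_b, 1.0 - freq_b)


def filter_snps(genotypes: Dict[str, Sequence[Genotype]],
                min_call_rate: float = 0.9,
                min_maf: float = 0.01) -> List[str]:
    """Names of SNPs passing both thresholds (illustrative defaults)."""
    return [name for name, calls in genotypes.items()
            if call_rate(calls) >= min_call_rate
            and minor_allele_frequency(calls) >= min_maf]


example = {
    "snp_a": [0, 1, 2, 1, 0],        # well called and polymorphic
    "snp_b": [0, 0, 0, 0, 0],        # monomorphic, so MAF is 0
    "snp_c": [1, None, None, 2, 0],  # only 3 of 5 samples called
}
passed = filter_snps(example)
```

In this toy dataset only `snp_a` survives: `snp_b` fails the MAF threshold and `snp_c` fails the call-rate threshold.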

2.
We developed a 384 multiplexed SNP array, named CitSGA-1, for the genotyping of Citrus cultivars, and evaluated the performance and reliability of the genotyping. SNPs were surveyed by direct sequence comparison of the sequence tagged site (STS) fragment amplified from genomic DNA of cultivars representing the genetic diversity of citrus breeding in Japan. Among 1497 SNP candidates, 384 SNPs for a high-throughput genotyping array were selected based on physical parameters of Illumina’s bead array criteria. The assay using CitSGA-1 was applied to a hybrid population of 88 progeny and 103 citrus accessions for breeding in Japan, which resulted in 73,726 SNP calls. A total of 351 SNPs (91%) could call different genotypes among the DNA samples, resulting in a success rate for the assay comparable to previously reported rates for other plant species. To confirm the reliability of SNP genotype calls, parentage analysis was applied, and it indicated that the numbers of reliable SNPs and corresponding STSs were 276 and 213, respectively. The multiplexed SNP genotyping array reported here will be useful for the efficient construction of linkage maps, for the detection of markers for marker-assisted breeding, and for the identification of cultivars.

3.

Background

High density genotyping data are indispensable for genomic analyses of complex traits in animal and crop species. Maize is one of the most important crop plants worldwide; however, a high density SNP genotyping array for analysis of its large and highly dynamic genome has not been available so far.

Results

We developed a high density maize SNP array composed of 616,201 variants (SNPs and small indels). Initially, 57 M variants were discovered by sequencing 30 representative temperate maize lines and then stringently filtered for sequence quality scores and predicted conversion performance on the array resulting in the selection of 1.2 M polymorphic variants assayed on two screening arrays. To identify high-confidence variants, 285 DNA samples from a broad genetic diversity panel of worldwide maize lines including the samples used for sequencing, important founder lines for European maize breeding, hybrids, and proprietary samples with European, US, semi-tropical, and tropical origin were used for experimental validation. We selected 616 k variants according to their performance during validation, support of genotype calls through sequencing data, and physical distribution for further analysis and for the design of the commercially available Affymetrix® Axiom® Maize Genotyping Array. This array is composed of 609,442 SNPs and 6,759 indels. Among these are 116,224 variants in coding regions and 45,655 SNPs of the Illumina® MaizeSNP50 BeadChip for study comparison. In a subset of 45,974 variants, apart from the target SNP additional off-target variants are detected, which show only a minor bias towards intermediate allele frequencies. We performed principal coordinate and admixture analyses to determine the ability of the array to detect and resolve population structure and investigated the extent of LD within a worldwide validation panel.

Conclusions

The high density Affymetrix® Axiom® Maize Genotyping Array is optimized for European and American temperate maize and was developed based on a diverse sample panel by applying stringent quality filter criteria to ensure its suitability for a broad range of applications. With 600 k variants it is the largest currently publicly available genotyping array in crop species.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-823) contains supplementary material, which is available to authorized users.

4.
5.
Cultivated apple (Malus × domestica Borkh.) is one of the most important fruit crops in temperate regions, and has great economic and cultural value. The apple genome is highly heterozygous and has undergone a recent duplication which, combined with a rapid linkage disequilibrium decay, makes it difficult to perform genome‐wide association (GWA) studies. Single nucleotide polymorphism arrays offer highly multiplexed assays at a relatively low cost per data point and can be a valid tool for the identification of the markers associated with traits of interest. Here, we describe the development and validation of a 487K SNP Affymetrix Axiom® genotyping array for apple and discuss its potential applications. The array has been built from the high‐depth resequencing of 63 different cultivars covering most of the genetic diversity in cultivated apple. The SNPs were chosen by applying a focal points approach to enrich genic regions, but also to reach a uniform coverage of non‐genic regions. A total of 1324 apple accessions, including the 92 progenies of two mapping populations, have been genotyped with the Axiom®Apple480K to assess the effectiveness of the array. A large majority of SNPs (359 994 or 74%) fell in the stringent class of poly high resolution polymorphisms. We also devised a filtering procedure to identify a subset of 275K very robust markers that can be safely used for germplasm surveys in apple. The Axiom®Apple480K has now been commercially released both for public and proprietary use and will likely be a reference tool for GWA studies in apple.

6.
The successful application of genomic selection (GS) approaches is dependent on genetic markers derived from high-throughput and low-cost genotyping methods. Recent GS studies in trees have predominantly relied on SNP arrays as the source of genotyping, though this technology has a high entry cost. The recent development of alternative genotyping platforms, tailored to specific species and with low entry cost, has become possible due to advances in next-generation sequencing and genome complexity reduction methods such as sequence capture. However, the performance of these new platforms in GS models has not yet been evaluated, or compared to models developed from SNP arrays. Here, we evaluate the impact of these genotyping technologies on the development of GS prediction models for a Eucalyptus breeding population composed of 739 trees phenotyped for 13 wood quality and growth traits. Genotyping data obtained with both methods were compared for linkage disequilibrium, minor allele frequency, and missing data. Phenotypic prediction methods RR-BLUP and BayesB were employed, while predictive ability using cross validation was used to evaluate the performance of GS models derived from the different genotyping platforms. Differences in linkage disequilibrium patterns, minor allele frequency, missing data, and marker distribution were detected between sequence capture and SNP arrays. However, RR-BLUP and BayesB GS models resulted in similar predictive abilities. These results demonstrate that both genotyping methods are equivalent for genomic prediction of the traits evaluated. Sequence capture offers an alternative for species where SNP arrays are not available, or for when the initial development cost is too high.

7.
An integrated system for high throughput TaqMan based SNP genotyping.   Cited by: 5 (self-citations: 0, citations by others: 5)
We have developed an integrated laboratory information system that allows the flexible handling of pedigree, phenotype and genotype information. Specifically, it includes client applications for an integrated data import from TaqMan typing files, Mendel checking, data export, handling of pedigree and phenotype information and analysis features. AVAILABILITY: The SQL source code, sources and binaries of the client applications (NT and Windows95/98 platforms) and additional documentation are available at http://www.mucosa.de/.
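
To make the idea of Mendel checking concrete, here is a minimal Python sketch that tests whether a child's genotype at one biallelic autosomal SNP is compatible with the genotypes of both parents. It is not taken from the system described above; the two-character genotype strings, the function name, and the assumption of complete (non-missing) trio data are choices made only for this illustration.

```python
from itertools import product


def is_mendelian_consistent(father: str, mother: str, child: str) -> bool:
    """Return True if the child could inherit one allele from each parent.

    Genotypes are two-character strings such as "AG"; allele order is
    ignored, so "AG" and "GA" are treated as the same genotype.
    """
    # Every combination of one paternal and one maternal allele.
    possible = {"".join(sorted(a + b)) for a, b in product(father, mother)}
    return "".join(sorted(child)) in possible


# A homozygous AA x AA mating cannot produce a heterozygous child.
trio_ok = is_mendelian_consistent("AA", "AG", "AG")
trio_error = is_mendelian_consistent("AA", "AA", "AG")
```

A real pipeline would also need to handle missing calls, sex chromosomes, and incomplete pedigrees, all of which this sketch leaves out.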

8.
Cultivated soybean (Glycine max) suffers from a narrow germplasm relative to other crop species, probably because of under‐use of wild soybean (Glycine soja) as a breeding resource. Use of a single nucleotide polymorphism (SNP) genotyping array is a promising method for dissecting cultivated and wild germplasms to identify important adaptive genes through high‐density genetic mapping and genome‐wide association studies. Here we describe a large soybean SNP array for use in diversity analyses, linkage mapping and genome‐wide association analyses. More than four million high‐quality SNPs identified from high‐depth genome re‐sequencing of 16 soybean accessions and low‐depth genome re‐sequencing of 31 soybean accessions were used to select 180 961 SNPs for creation of the Axiom® SoyaSNP array. Validation analysis for a set of 222 diverse soybean lines showed that 170 223 markers were of good quality for genotyping. Phylogenetic and allele frequency analyses of the validation set data indicated that accessions showing an intermediate morphology between cultivated and wild soybeans collected in Korea were natural hybrids. More than 90 unanchored scaffolds in the current soybean reference sequence were assigned to chromosomes using this array. Finally, dense average spacing and preferential distribution of the SNPs in gene‐rich chromosomal regions suggest that this array may be suitable for genome‐wide association studies of soybean germplasm. Taken together, these results suggest that use of this array may be a powerful method for soybean genetic analyses relating to many aspects of soybean breeding.

9.
In wheat, a lack of genetic diversity between breeding lines has been recognized as a significant block to future yield increases. Species belonging to bread wheat's secondary and tertiary gene pools harbour a much greater level of genetic variability, and are an important source of genes to broaden its genetic base. Introgression of novel genes from progenitors and related species has been widely employed to improve the agronomic characteristics of hexaploid wheat, but this approach has been hampered by a lack of markers that can be used to track introduced chromosome segments. Here, we describe the identification of a large number of single nucleotide polymorphisms that can be used to genotype hexaploid wheat and to identify and track introgressions from a variety of sources. We have validated these markers using an ultra‐high‐density Axiom® genotyping array to characterize a range of diploid, tetraploid and hexaploid wheat accessions and wheat relatives. To facilitate the use of these, both the markers and the associated sequence and genotype information have been made available through an interactive web site.

10.
This study provides a new version of the arrayed primer extension (APEX) protocol adapted to the 'array of arrays' platform using an instrumental setup for microarray processing not previously described. The primary aim of the study is to implement a system for rational cost-efficient genotyping where multiple single nucleotide polymorphisms (SNPs) and individuals are genotyped on each microarray slide. Genotyping results are collected across 185 healthy Danish subjects and 76 SNPs on chromosome 3q13.31, because linkage to atopic disease phenotypes has been suggested in the Danish population. Linkage disequilibrium (LD) results from the experimental data are used in a novel comparison to baseline data defined by the international HapMap SNP database. Comparison of the LD results reveals a strong linear correlation irrespective of the LD measure considered: R² (D′) = 0.73 and R² (r²) = 0.54. In conclusion, our results show that this setup is strong enough to support high-throughput genotyping, and these observations support that the HapMap genotype resource is important for defining SNP panels aimed at gene mapping in local subpopulations from Europe.
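
The two pairwise LD measures named above, D′ and r², follow from standard definitions. The Python sketch below computes both from counts of the four haplotypes at two biallelic loci. It is only an illustration: it assumes phased haplotype counts are already available (unphased genotype data would first need haplotype frequency estimation), and the function and argument names are hypothetical rather than taken from the study.

```python
from typing import Tuple


def ld_measures(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> Tuple[float, float]:
    """Return (D', r^2) for two biallelic loci from haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total                 # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total         # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total         # frequency of allele B at locus 2

    d = p_AB - p_A * p_B                # raw disequilibrium coefficient D

    # D is scaled by its maximum attainable magnitude given the allele
    # frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d_prime, r_squared


complete_ld = ld_measures(50, 0, 0, 50)      # only two haplotypes present
no_ld = ld_measures(25, 25, 25, 25)          # loci are independent
partial_ld = ld_measures(40, 0, 10, 50)      # D' is 1 while r^2 is below 1
```

The third example shows why the two measures can disagree: with one haplotype absent D′ reaches 1, while r² stays below 1 because the allele frequencies at the two loci differ.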

11.
Genes underlying repeated adaptive evolution in natural populations are still largely unknown. Stickleback fish (Gasterosteus aculeatus) have undergone a recent dramatic evolutionary radiation, generating numerous examples of marine-freshwater species pairs and a small number of benthic-limnetic species pairs found within single lakes [1]. We have developed a new genome-wide SNP genotyping array to study patterns of genetic variation in sticklebacks over a wide geographic range, and to scan the genome for regions that contribute to repeated evolution of marine-freshwater or benthic-limnetic species pairs. Surveying 34 global populations with 1,159 informative markers revealed substantial genetic variation, with predominant patterns reflecting demographic history and geographic structure. After correcting for geographic structure and filtering for neutral markers, we detected large repeated shifts in allele frequency at some loci, identifying both known and novel loci likely contributing to marine-freshwater and benthic-limnetic divergence. Several novel loci fall close to genes implicated in epithelial barrier or immune functions, which have likely changed as sticklebacks adapt to contrasting environments. Specific alleles differentiating sympatric benthic-limnetic species pairs are shared in nearby solitary populations, suggesting an allopatric origin for adaptive variants and selection pressures unrelated to sympatry in the initial formation of these classic vertebrate species pairs.

12.
13.
The HUGO Pan-Asian SNP consortium conducted the largest survey to date of human genetic diversity among Asians by sampling 1,719 unrelated individuals among 71 populations from China, India, Indonesia, Japan, Malaysia, the Philippines, Singapore, South Korea, Taiwan, and Thailand. We have constructed a database (PanSNPdb), which contains these data and various new analyses of them. PanSNPdb is a research resource in the analysis of the population structure of Asian peoples, including linkage disequilibrium patterns, haplotype distributions, and copy number variations. Furthermore, PanSNPdb provides an interactive comparison with other SNP and CNV databases, including HapMap3, JSNP, dbSNP and DGV and thus provides a comprehensive resource of human genetic diversity. The information is accessible via a widely accepted graphical interface used in many genetic variation databases. Unrestricted access to PanSNPdb and any associated files is available at: http://www4a.biotec.or.th/PASNP.

14.
Genotyping with large numbers of molecular markers is now an indispensable tool within plant genetics and breeding. Especially through the identification of large numbers of single nucleotide polymorphism (SNP) markers using the novel high-throughput sequencing technologies, it is now possible to reliably identify many thousands of SNPs at many different loci in a given plant genome. For a number of important crop plants, SNP markers are now being used to design genotyping arrays containing thousands of markers spread over the entire genome and to analyse large numbers of samples. In this article, we discuss aspects that should be considered during the design of such large genotyping arrays and the analysis of individuals. The fact that crop plants are also often autopolyploid or allopolyploid is given due consideration. Furthermore, we outline some potential applications of large genotyping arrays including high-density genetic mapping, characterization (fingerprinting) of genetic material and breeding-related aspects such as association studies and genomic selection.

15.

Key message

An innovative genotyping method designated as semi-thermal asymmetric reverse PCR (STARP) was developed for genotyping individual SNPs with improved accuracy, flexible throughputs, low operational costs, and high platform compatibility.

Abstract

Multiplex chip-based technology for genome-scale genotyping of single nucleotide polymorphisms (SNPs) has made great progress in the past two decades. However, PCR-based genotyping of individual SNPs still remains problematic in accuracy, throughput, simplicity, and/or operational costs as well as the compatibility with multiple platforms. Here, we report a novel SNP genotyping method designated semi-thermal asymmetric reverse PCR (STARP). In this method, genotyping assay was performed under unique PCR conditions using two universal priming element-adjustable primers (PEA-primers) and one group of three locus-specific primers: two asymmetrically modified allele-specific primers (AMAS-primers) and their common reverse primer. The two AMAS-primers each were substituted one base in different positions at their 3′ regions to significantly increase the amplification specificity of the two alleles and tailed at 5′ ends to provide priming sites for PEA-primers. The two PEA-primers were developed for common use in all genotyping assays to stringently target the PCR fragments generated by the two AMAS-primers with similar PCR efficiencies and for flexible detection using either gel-free fluorescence signals or gel-based size separation. The state-of-the-art primer design and unique PCR conditions endowed STARP with all the major advantages of high accuracy, flexible throughputs, simple assay design, low operational costs, and platform compatibility. In addition to SNPs, STARP can also be employed in genotyping of indels (insertion–deletion polymorphisms). As vast variations in DNA sequences are being unearthed by many genome sequencing projects and genotyping by sequencing, STARP will have wide applications across all biological organisms in agriculture, medicine, and forensics.

16.
A wealth of genetic associations for cardiovascular and metabolic phenotypes in humans has been accumulating over the last decade, in particular a large number of loci derived from recent genome wide association studies (GWAS). True complex disease-associated loci often exert modest effects, so their delineation currently requires integration of diverse phenotypic data from large studies to ensure robust meta-analyses. We have designed a gene-centric 50 K single nucleotide polymorphism (SNP) array to assess potentially relevant loci across a range of cardiovascular, metabolic and inflammatory syndromes. The array utilizes a "cosmopolitan" tagging approach to capture the genetic diversity across approximately 2,000 loci in populations represented in the HapMap and SeattleSNPs projects. The array content is informed by GWAS of vascular and inflammatory disease, expression quantitative trait loci implicated in atherosclerosis, pathway based approaches and comprehensive literature searching. The custom flexibility of the array platform facilitated interrogation of loci at differing stringencies, according to a gene prioritization strategy that allows saturation of high priority loci with a greater density of markers than the existing GWAS tools, particularly in African HapMap samples. We also demonstrate that the IBC array can be used to complement GWAS, increasing coverage in high priority CVD-related loci across all major HapMap populations. DNA from over 200,000 extensively phenotyped individuals will be genotyped with this array with a significant portion of the generated data being released into the academic domain facilitating in silico replication attempts, analyses of rare variants and cross-cohort meta-analyses in diverse populations. These datasets will also facilitate more robust secondary analyses, such as explorations with alternative genetic models, epistasis and gene-environment interactions.

17.
Copy number variations (CNVs) are important forms of structural variation in humans and animals and can be considered a major genetic component of phenotypic diversity. Here we used the Illumina PorcineSNP60 BeadChip V2 and a DLY [Duroc × (Large White × Landrace)] commercial hybrid population to identify 272 CNVs belonging to 165 CNV regions (CNVRs), of which 66 are new. As CNVRs are specific to the population of origin, our DLY-specific data are an important complement to the existing CNV map of the pig genome. Eight CNVRs were selected for validation by quantitative real-time PCR (qRT-PCR) and the accuracy rate was high (87.25%). Gene function analysis suggested that a common CNVR may play an important role in multiple traits, including growth rate and carcass quality.

18.
MOTIVATION: Modern strategies for mapping disease loci require efficient genotyping of a large number of known polymorphic sites in the genome. The sensitive and high-throughput nature of hybridization-based DNA microarray technology provides an ideal platform for such an application by interrogating up to hundreds of thousands of single nucleotide polymorphisms (SNPs) in a single assay. Similar to the development of expression arrays, these genotyping arrays pose many data analytic challenges that are often platform specific. Affymetrix SNP arrays, for example, use multiple sets of short oligonucleotide probes for each known SNP, and require effective statistical methods to combine these probe intensities in order to generate reliable and accurate genotype calls. RESULTS: We developed an integrated multi-SNP, multi-array genotype calling algorithm for Affymetrix SNP arrays, MAMS, that combines single-array multi-SNP (SAMS) and multi-array, single-SNP (MASS) calls to improve the accuracy of genotype calls, without the need for training data or computation-intensive normalization procedures as in other multi-array methods. The algorithm uses resampling techniques and model-based clustering to derive single-array-based genotype calls, which are subsequently refined by competitive genotype calls based on MASS clustering. The resampling scheme caps computation for single-array analysis and hence is readily scalable, which is important in view of expanding numbers of SNPs per array. The MASS update is designed to improve calls for atypical SNPs, harboring allele-imbalanced binding affinities, that are difficult to genotype without information from other arrays. Using a publicly available data set of HapMap samples from Affymetrix, and independent calls by alternative genotyping methods from the HapMap project, we show that our approach performs competitively with existing methods. AVAILABILITY: R functions are available upon request from the authors.

19.

Background  

Illumina's Infinium SNP BeadChips are extensively used in both small and large-scale genetic studies. A fundamental step in any analysis is the processing of raw allele A and allele B intensities from each SNP into genotype calls (AA, AB, BB). Various algorithms which make use of different statistical models are available for this task. We compare four methods (GenCall, Illuminus, GenoSNP and CRLMM) on data where the true genotypes are known in advance and data from a recently published genome-wide association study.
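
To show what turning allele A and allele B intensities into AA, AB and BB calls involves, here is a deliberately simplified Python sketch. It is not GenCall, Illuminus, GenoSNP or CRLMM, all of which fit statistical models to the data. Instead it converts each intensity pair to a polar angle and applies fixed cut-offs. The function name and every threshold value are assumptions chosen for the example.

```python
import math
from typing import Optional


def call_genotype(x: float, y: float,
                  min_intensity: float = 0.2,
                  aa_max: float = 0.2,
                  bb_min: float = 0.8) -> Optional[str]:
    """Toy genotype call from normalized allele A (x) and allele B (y) signals.

    Returns "AA", "AB" or "BB", or None (no call) when the combined
    signal is too weak. All thresholds are illustrative assumptions.
    """
    r = x + y                                   # combined signal strength
    if r < min_intensity:
        return None
    # Angle rescaled to [0, 1]: near 0 for pure A signal, near 1 for pure B.
    theta = (2.0 / math.pi) * math.atan2(y, x)
    if theta <= aa_max:
        return "AA"
    if theta >= bb_min:
        return "BB"
    return "AB"


calls = [call_genotype(1.0, 0.05),   # mostly A signal
         call_genotype(0.5, 0.5),    # balanced signal
         call_genotype(0.05, 1.0),   # mostly B signal
         call_genotype(0.05, 0.05)]  # too weak to call
```

Fixed cut-offs like these break down when cluster positions shift between SNPs, which is one reason model-based callers such as the four compared above exist.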

20.
Since the beginning of the genomic era, the number of available single nucleotide polymorphism (SNP) arrays has grown considerably. In the bovine species alone, 11 SNP chips not completely covered by intellectual property are currently available, and the number is growing. Genomic/genotype data are not standardized, and this hampers their exchange and integration. In addition, software used for the analyses of these data usually requires non‐standard (i.e. case‐specific) input files which, considering the large amount of data to be handled, require at least some programming skills in their production. In this work, we describe a software toolkit for SNP array data management, imputation, genome‐wide association studies, population genetics and genomic selection. However, this toolkit does not solve the critical need for standardization of the genotypic data and software input files. It only highlights the chaotic situation each researcher has to face on a daily basis and gives some helpful advice on the currently available tools in order to navigate the SNP array data complexity.
