Found 20 similar articles (search time: 15 ms)
1.
Individual genotyping of single nucleotide polymorphisms (SNPs) remains expensive, especially for linkage disequilibrium mapping strategies involving high-throughput SNP genotyping. On one hand, current methods may suit scientific and laboratory needs in regard to accuracy, reproducibility/robustness, and large-scale application. On the other hand, a cheaper and less time-consuming alternative to individual genotyping is the use of SNP allele frequencies determined in DNA pools. We have developed an accurate and reproducible protocol for allele frequency determination using Pyrosequencing technology in large genomic DNA pools (374 individuals). The measured correlation (R²) in large DNA pools was 0.980. In the context of disease-associated SNP studies, we compared the allele frequencies between the disease (e.g., type 2 diabetes and obesity) and control groups detected by either individual genotyping or Pyrosequencing of DNA pools. In large pools, the variation between the two methods was 1.5 +/- 0.9%. It may be concluded that the allele frequency determination protocol could reliably detect over 4% differences between populations. The method is economical in regard to amounts of DNA, PCR, and primer extension reagents required. Furthermore, it allows the rapid determination of allele frequency differences in case/control groups for association studies and susceptibility gene discovery in complex diseases.
2.
Downes K Barratt BJ Akan P Bumpstead SJ Taylor SD Clayton DG Deloukas P 《BioTechniques》2004,36(5):840-845
The estimation of single nucleotide polymorphism (SNP) allele frequency in pooled DNA samples has been proposed as a cost-effective approach to whole genome association studies. However, the key issue is the allele frequency window in which a genotyping method operates and provides a statistically reliable answer. We assessed the homogeneous mass extend assay and estimated the variance associated with each experimental stage. We report that a relationship between estimated allele frequency and variance might exist, suggesting that high statistical power can be retained at low, as well as high, allele frequencies. Assuming this relationship, the formation of subpools consisting of 100 samples retains an effective sample size greater than 70% of the true sample size, with a savings of 11-fold the cost of an individual genotyping study, regardless of allele frequency.
3.
Identifying the genetic variation underlying complex disease requires analysis of many single nucleotide polymorphisms (SNPs) in a large number of samples. Several high-throughput SNP genotyping techniques are available; however, their cost promotes the use of association screening with pooled DNA. This protocol describes the estimation of SNP allele frequencies in pools of DNA using the quantitative sequencing method Pyrosequencing (PSQ). PSQ is a relatively recently described high-throughput method for genotyping, allele frequency estimation and DNA methylation analysis based on the detection of real-time pyrophosphate release during synthesis of the complementary strand to a PCR product. The protocol involves the following steps: (i) quantity and quality assessment of individual DNA samples; (ii) DNA pooling, which may be undertaken at the pre- or post-PCR stage; (iii) PCR amplification of PSQ template containing the variable sequence region of interest; and (iv) PSQ to determine the frequency of alleles at a particular SNP site. Once the quantity and quality of individual DNA samples have been assessed, the protocol usually requires a few days for setting up pre-PCR pools, depending on sample number. After PCR amplification, preparation and analysis of PCR amplicon by PSQ takes 1 h per plate.
4.
To evaluate the ability to use DNA pools with the Illumina Infinium genotyping platform, two sets of gradient pools were created using two pairs of highly inbred chicken lines. Replicate pools containing 0%, 10%, 20%, 40%, 60%, 80%, 90% and 100% of DNA from line A vs. B or line C vs. D were created, for a total of 28 pools. All pools were genotyped for 12 046 SNPs. Three frequency estimation methods proposed in the literature (standard, heterozygote‐corrected and normalized) were compared with three alternate methods proposed herein based on mean square error (MSE), bias and variance of estimated vs. true allele frequencies and the fit of regression of estimated on true frequencies. The three new methods had average square root MSE of 4.6%, 4.6% and 4.7% compared to 5.2%, 5.5% and 11.2% for the three literature methods. Average absolute biases of the literature methods were 2.4%, 2.7% and 8.2% compared to 2.4% for all new methods. Standard deviations of estimates were also smaller for the new methods, at 3.1%, 3.2% and 3.2% compared to 3.5%, 4.0% and 5.0% for previously reported methods. In conclusion, intensity data from the Illumina Infinium Assay can be efficiently used to estimate allele frequencies in pools, in particular using any of the new methods proposed herein.
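The comparison criteria named in this abstract (bias, spread and root MSE of estimated versus true allele frequencies) can be illustrated with a minimal sketch. The function name and the sample numbers are hypothetical, and the abstract does not state exactly how the study averaged these quantities across SNPs and pools, so this shows only the generic definitions.

```python
import math


def frequency_error_metrics(estimated, true):
    """Return (bias, SD of errors, root MSE) for paired allele frequencies.

    Hypothetical helper: generic textbook definitions, not the exact
    procedure used in the study summarized above.
    """
    if len(estimated) != len(true) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [e - t for e, t in zip(estimated, true)]
    n = len(errors)
    bias = sum(errors) / n                      # mean signed error
    mse = sum(err * err for err in errors) / n  # mean square error
    variance = max(mse - bias * bias, 0.0)      # MSE = bias^2 + variance
    return bias, math.sqrt(variance), math.sqrt(mse)


# Made-up pool estimates versus known gradient-pool frequencies.
bias, sd, rmse = frequency_error_metrics([0.12, 0.22, 0.38], [0.10, 0.20, 0.40])
print(f"bias={bias:.4f} sd={sd:.4f} rmse={rmse:.4f}")
```

The decomposition MSE = bias² + variance is why a method can have a small bias and still a large root MSE, as the normalized literature method does here.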
5.
Estimation of single nucleotide polymorphism allele frequency in DNA pools by using Pyrosequencing (Total citations: 10; self-citations: 0; citations by others: 10)
Positional cloning of genes underlying complex diseases, such as type 2 diabetes mellitus (T2DM), typically follows a two-tiered process in which a chromosomal region is first identified by genome-wide linkage scanning, followed by association analyses using densely spaced single nucleotide polymorphic markers to identify the causal variant(s). The success of genome-wide single nucleotide polymorphism (SNP) detection has resulted in a vast number of potential markers available for use in the construction of such dense SNP maps. However, the cost of genotyping large numbers of SNPs in appropriately sized samples is nearly prohibitive. We have explored pooled DNA genotyping as a means of identifying differences in allele frequency between pools of individuals with T2DM and unaffected controls by using Pyrosequencing technology. We found that allele frequencies in pooled DNA were strongly correlated with those in individuals (r=0.99, P<0.0001) across a wide range of allele frequencies (0.02-0.50). We further investigated the sensitivity of this method to detect allele frequency differences between contrived pools, also over a wide range of allele frequencies. We found that Pyrosequencing was able to detect an allele frequency difference of less than 2% between pools, indicating that this method may be sensitive enough for use in association studies involving complex diseases where a small difference in allele frequency between cases and controls is expected.
6.
Strategies for identifying genetic risk factors in complex diseases by association studies require the comparison of allele frequencies of numerous SNPs between affected and control populations. Theoretically, hundreds of thousands of SNP markers across the genome will have to be genotyped in these studies. Genotyping SNPs one sample at a time is extremely costly and time consuming. To streamline whole genome association studies, some have proposed to screen SNPs by pooling the DNA samples initially for allele frequency determination and perform individual genotyping only when there is a significant discrepancy in allele frequencies between the affected and control populations. Here we describe a new method for determining the allele frequency of SNPs in pooled DNA samples using a two-color primer extension assay with real-time monitoring of fluorescence polarization (named kinetic FP-TDI assay). By comparing the ratio of the rate of incorporation of the two allele-specific dye-terminators, one can calculate the relative amounts of each allele in the pooled sample. The accuracy of allele frequency determination with pooled samples is within 3.3 +/- 0.8% of that determined by genotyping individual samples that make up the pool.
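The rate-ratio idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the calibration used in the kinetic FP-TDI assay, so the function name, the `rate_correction` parameter and the example rates are all hypothetical.

```python
def allele_frequency_from_rates(rate_allele_1, rate_allele_2, rate_correction=1.0):
    """Estimate the frequency of allele 1 from two incorporation rates.

    Assumption: each rate is proportional to the amount of its allele in
    the pool. `rate_correction` is a hypothetical factor that rescales the
    allele-2 rate if the two dye-terminators incorporate with unequal
    efficiency; 1.0 means no correction.
    """
    if rate_allele_1 < 0 or rate_allele_2 < 0:
        raise ValueError("rates must be non-negative")
    total = rate_allele_1 + rate_correction * rate_allele_2
    if total <= 0:
        raise ValueError("at least one rate must be positive")
    return rate_allele_1 / total


# Made-up rates in arbitrary units: allele 1 incorporates three times faster.
print(allele_frequency_from_rates(3.0, 1.0))  # 0.75
```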
7.
Background
DNA pooling is a technique to reduce genotyping effort while incurring only minor losses in accuracy of allele frequency estimates for single nucleotide polymorphism (SNP) markers.
8.
We present a statistical framework for estimation and application of sample allele frequency spectra from New-Generation Sequencing (NGS) data. In this method, we first estimate the allele frequency spectrum using maximum likelihood. In contrast to previous methods, the likelihood function is calculated using a dynamic programming algorithm and numerically optimized using analytical derivatives. We then use a Bayesian method for estimating the sample allele frequency in a single site, and show how the method can be used for genotype calling and SNP calling. We also show how the method can be extended to various other cases including cases with deviations from Hardy-Weinberg equilibrium. We evaluate the statistical properties of the methods using simulations and by application to a real data set.
9.
PPC: an algorithm for accurate estimation of SNP allele frequencies in small equimolar pools of DNA using data from high density microarrays (Total citations: 1; self-citations: 1; citations by others: 1)
Robust estimation of allele frequencies in pools of DNA has the potential to reduce genotyping costs and/or increase the number of individuals contributing to a study where hundreds of thousands of genetic markers need to be genotyped in very large population sample sets, such as genome wide association studies. In order to make accurate allele frequency estimations from pooled samples, a correction for unequal allele representation must be applied. We have developed the polynomial based probe specific correction (PPC), which is a novel correction algorithm for accurate estimation of allele frequencies in data from high-density microarrays. This algorithm was validated through comparison of allele frequencies from a set of 10 individually genotyped DNAs and frequencies estimated from pools of these 10 DNAs using GeneChip 10K Mapping Xba 131 arrays. Our results demonstrate that when using the PPC to correct for allelic biases, the accuracy of the allele frequency estimates increases dramatically.
10.
Highly cost-efficient genome-wide association studies using DNA pools and dense SNP arrays (Total citations: 3; self-citations: 0; citations by others: 3)
Macgregor S Zhao ZZ Henders A Nicholas MG Montgomery GW Visscher PM 《Nucleic acids research》2008,36(6):e35
Genome-wide association (GWA) studies to map genes for complex traits are powerful yet costly. DNA-pooling strategies have the potential to dramatically reduce the cost of GWA studies. Pooling using Affymetrix arrays has been proposed and used but the efficiency of these arrays has not been quantified. We compared and contrasted Affymetrix Genechip HindIII and Illumina HumanHap300 arrays on the same DNA pools and showed that the HumanHap300 arrays are substantially more efficient. In terms of effective sample size, HumanHap300-based pooling extracts >80% of the information available with individual genotyping (IG). In contrast, Genechip HindIII-based pooling only extracts ~30% of the available information. With HumanHap300 arrays, concordance with IG data is excellent. Guidance is given on best study design and it is shown that even after taking into account pooling error, one-stage scans can be performed for >100-fold reduced cost compared with IG. With appropriately designed two-stage studies, IG can provide confirmation of pooling results whilst still providing ~20-fold reduction in total cost compared with IG-based alternatives. The large cost savings with Illumina HumanHap300-based pooling imply that future studies need only be limited by the availability of samples and not cost.
11.
SNP discovery and allele frequency estimation by deep sequencing of reduced representation libraries (Total citations: 2; self-citations: 0; citations by others: 2)
Van Tassell CP Smith TP Matukumalli LK Taylor JF Schnabel RD Lawley CT Haudenschild CD Moore SS Warren WC Sonstegard TS 《Nature methods》2008,5(3):247-252
High-density single-nucleotide polymorphism (SNP) arrays have revolutionized the ability of genome-wide association studies to detect genomic regions harboring sequence variants that affect complex traits. Extensive numbers of validated SNPs with known allele frequencies are essential to construct genotyping assays with broad utility. We describe an economical, efficient, single-step method for SNP discovery, validation and characterization that uses deep sequencing of reduced representation libraries (RRLs) from specified target populations. Using nearly 50 million sequences generated on an Illumina Genome Analyzer from DNA of 66 cattle representing three populations, we identified 62,042 putative SNPs and predicted their allele frequencies. Genotype data for these 66 individuals validated 92% of 23,357 selected genome-wide SNPs, with a genotypic and sequence allele frequency correlation of r = 0.67. This approach for simultaneous de novo discovery of high-quality SNPs and population characterization of allele frequencies may be applied to any species with at least a partially sequenced genome.
12.
Assessing allele frequencies of single nucleotide polymorphisms in DNA pools by pyrosequencing technology (Total citations: 9; self-citations: 0; citations by others: 9)
Single nucleotide polymorphism (SNP) association studies searching for differences in allele frequencies between cases and controls have been widely used for genetic analysis. Individual genotyping is prohibitively expensive in large sample sizes. Pooling of samples provides the obvious advantage of higher throughput and lower cost. Here we report our results with the analysis of SNP allele frequencies in DNA pools using Pyrosequencing technology. For seven different SNPs, we observed a mean difference of 1.1 +/- 0.6% between allele frequencies determined in two different DNA pools (n = 150 cases and 150 controls) compared to individually genotyped samples.
13.
A method is described on the basis of a modification of the granddaughter design to obtain estimates of quantitative trait loci (QTL) allele frequencies in dairy cattle populations and to determine QTL genotypes for both homozygous and heterozygous grandsires. The method is based on determining the QTL allele passed from grandsires to their maternal granddaughters using haplotypes consisting of several closely linked genetic markers. This method was applied to simulated data of 10 grandsire families, each with 500 granddaughters, and a QTL with a substitution effect of 0.4 phenotypic standard deviations and to actual data for a previously analyzed QTL in the center of chromosome 6, with substitution effect of 1 phenotypic standard deviation on protein percentage. In the simulated data the standard error for the estimated QTL substitution effect with four closely linked multiallelic markers was only 7% greater than the expected standard error with completely correct identification of QTL allele origin. The method estimated the population QTL allelic frequency as 0.64 +/- 0.07, compared to the simulated value of 0.7. In the actual data, the frequency of the allele that increases protein percentage was estimated as 0.63 +/- 0.06. In both data sets the hypothesis of equal allelic frequencies was rejected at P < 0.05.
14.
Cao P Wang QJ Zhu XT Zhou H Li R Wang WP 《Journal of chromatography. B, Analytical technologies in the biomedical and life sciences》2011,879(7-8):527-532
Quantitative determination of the allele frequency of single-nucleotide polymorphisms (SNPs) in pooled DNA samples is a promising approach to clarify the relationships between SNPs and diseases. Here, we present a simple, accurate, and inexpensive method for quantitatively determining the allele frequency in pooled DNA samples. Three steps are involved in this assay: DNA pooling, PCR amplification and sequencing. Although direct determination of the allele frequency from the two allele-specific fluorescence intensities is possible, correction for differential response of alleles is important. We explored the effect of differential response of alleles on test statistics and provide a solution to this problem based on heterozygous fluorescence intensities. We demonstrate the accuracy and reliability of this assay on pooled DNA samples with pre-determined allele frequencies from 7.1% to 53.9%. The accuracy of allele frequency measurements is high, with a correlation coefficient of r² = 0.997 between measured and known frequencies. We believe that by providing a means for SNP genotyping up to hundreds of samples simultaneously, inexpensively, and reproducibly, this method is a powerful strategy for detecting meaningful polymorphic differences in candidate gene association studies.
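A correction based on heterozygous fluorescence intensities is often written as f = A / (A + k·B), where k is the mean A/B signal ratio observed in heterozygous individuals, whose true allele ratio is 1:1. The abstract does not give the authors' formula, so the sketch below assumes this commonly used form; the function names and intensity values are hypothetical.

```python
def heterozygote_correction_factor(het_signals):
    """Mean A/B signal ratio over heterozygous samples (true ratio 1:1).

    `het_signals` is a list of (signal_a, signal_b) pairs. Assumed form of
    the correction; the source abstract does not specify its formula.
    """
    if not het_signals:
        raise ValueError("need at least one heterozygous sample")
    ratios = [a / b for a, b in het_signals]
    return sum(ratios) / len(ratios)


def corrected_allele_frequency(signal_a, signal_b, k=1.0):
    """Frequency of allele A in a pool, correcting for unequal allele response.

    With k = 1.0 this reduces to the uncorrected estimate A / (A + B).
    """
    return signal_a / (signal_a + k * signal_b)


# Made-up intensities: allele A gives about 20% more signal than allele B.
k = heterozygote_correction_factor([(1200.0, 1000.0), (600.0, 500.0)])
raw = corrected_allele_frequency(600.0, 500.0)       # uncorrected estimate
fixed = corrected_allele_frequency(600.0, 500.0, k)  # corrected estimate
print(f"k={k:.2f} uncorrected={raw:.3f} corrected={fixed:.3f}")
```

In this made-up case the uncorrected estimate overstates allele A (about 0.545) even though the pool signals match those of a heterozygote, while the corrected estimate returns 0.5.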
15.
Kirkpatrick B Armendariz CS Karp RM Halperin E 《Bioinformatics (Oxford, England)》2007,23(22):3048-3055
MOTIVATION: The search for genetic variants that are linked to complex diseases such as cancer, Parkinson's, or Alzheimer's disease may lead to better treatments. Since haplotypes can serve as proxies for hidden variants, one method of finding the linked variants is to look for case-control associations between the haplotypes and disease. Finding these associations requires a high-quality estimation of the haplotype frequencies in the population. To this end, we present HaploPool, a method of estimating haplotype frequencies from blocks of consecutive SNPs. RESULTS: HaploPool leverages the efficiency of DNA pools and estimates the population haplotype frequencies from pools of disjoint sets, each containing two or three unrelated individuals. We study the trade-off between pooling efficiency and accuracy of haplotype frequency estimates. For a fixed genotyping budget, HaploPool performs favorably on pools of two individuals as compared with a state-of-the-art non-pooled phasing method, PHASE. Of independent interest, HaploPool can be used to phase non-pooled genotype data with an accuracy approaching that of PHASE. We compared our algorithm to three programs that estimate haplotype frequencies from pooled data. HaploPool is an order of magnitude more efficient (at least six times faster), and considerably more accurate than previous methods. In contrast to previous methods, HaploPool performs well with missing data, genotyping errors and long haplotype blocks (of between 5 and 25 SNPs).
16.
Wilkening S Hemminki K Thirumaran RK Bermejo JL Bonn S Försti A Kumar R 《BioTechniques》2005,39(6):853-858
Determination of allele frequency in pooled DNA samples is a powerful and efficient tool for large-scale association studies. In this study, we tested and compared three PCR-based methods for accuracy, reproducibility, cost, and convenience. The methods compared were: (i) real-time PCR with allele-specific primers, (ii) real-time PCR with allele-specific TaqMan probes, and (iii) quantitative sequencing. Allele frequencies of three single nucleotide polymorphisms in three different genes were estimated from pooled DNA. The pools were made of genomic DNA samples from 96 cases with basal cell carcinoma of the skin and 96 healthy controls with known genotypes. In this study, the allele frequency estimation made by real-time PCR with allele-specific primers had the smallest median deviation (MD) from the real allele frequency with 1.12% (absolute percentage points) and was also the cheapest method. However, this method required the most time for optimization and showed the highest variation between replicates (SD = 6.47%). Quantitative sequencing, the simplest method, was found to have intermediate accuracies (MD = 1.44%, SD = 4.2%). Real-time PCR with TaqMan probes, a convenient but very expensive method, had an MD of 1.47% and the lowest variation between replicates (SD = 3.18%).
17.
A simple and accurate method for determination of microsatellite total allele content differences between DNA pools (Total citations: 7; self-citations: 0; citations by others: 7)
Collins HE Li H Inda SE Anderson J Laiho K Tuomilehto J Seldin MF 《Human genetics》2000,106(2):218-226
DNA pooling is a potential tool for the efficient analysis of the large numbers of samples and DNA markers that are necessary for genome-wide association studies. A simple, accurate method for measuring total allele differences in comparisons between two pools containing large numbers of DNA samples is presented. This method compares relative peak height differences between electrophoretograms for each allele of a microsatellite. The method was evaluated by the analysis of 11 microsatellite markers and DNA pooled sample sizes of 50, 100, and 200 individual DNA samples from the same number of different subjects. Pools were created from previously individually genotyped subjects and constructed so that the pool comparisons would provide real total allele differences varying from 0% to 55%. Calculated pool differences were then compared with the real total allele differences determined by individual genotyping results. Together, over 200 comparisons demonstrated a correlation coefficient of 0.96, which compared favorably with other previous methods of analysis. This method could provide a rapid screen for total allele differences of greater than 10%, a threshold that should be applicable to detecting low relative risk genes in common diseases. Therefore, these studies suggest that DNA pooling could be a useful tool in association studies for the determination of candidate regions for a range of complex genetic diseases.
18.
We have developed a publicly accessible database (ALFRED, the ALlele FREquency Database) that catalogues allele frequency data for a wide range of population samples and DNA polymorphisms. This database is web-accessible through our laboratory (Kidd Lab) Web site: http://info.med.yale.edu/genetics/kkidd. ALFRED currently contains data on 60 populations and 156 genetic systems including single nucleotide polymorphisms (SNPs), short tandem repeat polymorphisms (STRPs), variable number of tandem repeats (VNTRs) and insertion-deletion polymorphisms. While data are not available for all population-DNA polymorphism combinations, over 2000 allele frequency tables have been entered. Our database is designed (i) to address our specific research requirements as well as broader scientific objectives; (ii) to allow researchers and interested educators to easily navigate and retrieve data of interest to them; and (iii) to integrate links to other related public databases such as dbSNP, GenBank and PubMed.
19.
20.
ALFRED (the ALlele FREquency Database) is designed to store and disseminate frequencies of alleles at human polymorphic sites for multiple populations, primarily for the population genetics and molecular anthropology communities. Currently ALFRED has information on over 180 polymorphic sites for more than 70 populations. Since our initial release of the database we have focussed on increasing the quantity and quality of data, making reciprocal links between ALFRED and other related databases, and providing useful tools to make the data more comprehensible to the end user. ALFRED is accessible from the Kidd Lab home page (http://info.med.yale.edu/genetics/kkidd/) or from ALFRED directly (http://alfred.med.yale.edu/alfred/index.asp).