Found 20 similar articles; search took 310 ms
1.
Estimation of single nucleotide polymorphism allele frequency in DNA pools by using Pyrosequencing (total citations: 10; self-citations: 0; citations by others: 10)
Positional cloning of genes underlying complex diseases, such as type 2 diabetes mellitus (T2DM), typically follows a two-tiered process in which a chromosomal region is first identified by genome-wide linkage scanning, followed by association analyses using densely spaced single nucleotide polymorphic markers to identify the causal variant(s). The success of genome-wide single nucleotide polymorphism (SNP) detection has resulted in a vast number of potential markers available for use in the construction of such dense SNP maps. However, the cost of genotyping large numbers of SNPs in appropriately sized samples is nearly prohibitive. We have explored pooled DNA genotyping as a means of identifying differences in allele frequency between pools of individuals with T2DM and unaffected controls by using Pyrosequencing technology. We found that allele frequencies in pooled DNA were strongly correlated with those in individuals (r=0.99, P<0.0001) across a wide range of allele frequencies (0.02-0.50). We further investigated the sensitivity of this method to detect allele frequency differences between contrived pools, also over a wide range of allele frequencies. We found that Pyrosequencing was able to detect an allele frequency difference of less than 2% between pools, indicating that this method may be sensitive enough for use in association studies involving complex diseases where a small difference in allele frequency between cases and controls is expected.
2.
Downes K Barratt BJ Akan P Bumpstead SJ Taylor SD Clayton DG Deloukas P 《BioTechniques》2004,36(5):840-845
The estimation of single nucleotide polymorphism (SNP) allele frequency in pooled DNA samples has been proposed as a cost-effective approach to whole genome association studies. However, the key issue is the allele frequency window in which a genotyping method operates and provides a statistically reliable answer. We assessed the homogeneous mass extend assay and estimated the variance associated with each experimental stage. We report that a relationship between estimated allele frequency and variance might exist, suggesting that high statistical power can be retained at low, as well as high, allele frequencies. Assuming this relationship, the formation of subpools consisting of 100 samples retains an effective sample size greater than 70% of the true sample size, with a savings of 11-fold the cost of an individual genotyping study, regardless of allele frequency.
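The "effective sample size" idea in the abstract above can be illustrated with a rough sketch. It assumes a simple model that the abstract does not spell out: the variance of a pooled frequency estimate is the binomial sampling variance p(1-p)/(2N) plus a measurement variance, and the measurement variance is proportional to p(1-p) with a hypothetical constant `k`. Under that assumption the ratio N_eff/N equals 1/(1 + 2Nk), which does not depend on the allele frequency. The function names and `k` are illustrative only and are not taken from the paper.

```python
def effective_fraction(n_individuals: int, k: float) -> float:
    """Return N_eff / N for a pool of n_individuals diploid samples.

    Assumed model (not from the abstract): measurement variance = k * p * (1 - p),
    added to the binomial sampling variance p * (1 - p) / (2 * N).
    """
    return 1.0 / (1.0 + 2.0 * n_individuals * k)


def max_k_for_fraction(n_individuals: int, fraction: float) -> float:
    """Largest k that still keeps N_eff / N at or above `fraction`."""
    return (1.0 / fraction - 1.0) / (2.0 * n_individuals)


# Subpools of 100 samples, as in the abstract; 0.70 is the reported lower bound.
k_limit = max_k_for_fraction(100, 0.70)
print(f"k must stay below about {k_limit:.5f}")
print(f"N_eff / N at that k: {effective_fraction(100, k_limit):.2f}")
```

Under this toy model, larger subpools demand proportionally smaller measurement error to keep the same fraction of the true sample size.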
3.
Individual genotyping of single nucleotide polymorphisms (SNPs) remains expensive, especially for linkage disequilibrium mapping strategies involving high-throughput SNP genotyping. On one hand, current methods may suit scientific and laboratory needs in regard to accuracy, reproducibility/robustness, and large-scale application. On the other hand, a cheaper and less time-consuming alternative to individual genotyping is the use of SNP allele frequencies determined in DNA pools. We have developed an accurate and reproducible protocol for allele frequency determination using Pyrosequencing technology in large genomic DNA pools (374 individuals). The measured correlation (R²) in large DNA pools was 0.980. In the context of disease-associated SNP studies, we compared the allele frequencies between the disease (e.g., type 2 diabetes and obesity) and control groups detected by either individual genotyping or Pyrosequencing of DNA pools. In large pools, the variation between the two methods was 1.5 ± 0.9%. It may be concluded that the allele frequency determination protocol could reliably detect differences of over 4% between populations. The method is economical in regard to the amounts of DNA, PCR, and primer extension reagents required. Furthermore, it allows the rapid determination of allele frequency differences in case/control groups for association studies and susceptibility gene discovery in complex diseases.
4.
High-throughput genotyping of swine populations is a potentially efficient method for establishing animal lineage and identification of loci important to animal health and efficient pork production. Markers were developed based upon single nucleotide polymorphisms (SNPs), which are abundant and amenable to automated genotyping platforms. The focus of this research was SNP discovery in expressed porcine genes providing markers to develop the porcine/human comparative map. Locus specific amplification (LSA) and comparative sequencing were used to generate PCR products and allelic information from parents of a swine reference family. Discovery of 1650 SNPs in 403 amplicons and strategies for optimizing LSA-based SNP discovery using alternative methods of PCR primer design, data analysis, and germplasm selection that are applicable to other populations and species are described. These data were the first large-scale assessment of frequency and distribution of porcine SNPs.
5.
Norton N Williams NM Williams HJ Spurlock G Kirov G Morris DW Hoogendoorn B Owen MJ O'Donovan MC 《Human genetics》2002,110(5):471-478
Detecting alleles that confer small increments in susceptibility to disease will require large-scale allelic association studies of single-nucleotide polymorphisms (SNPs) in candidate, or positional candidate, genes. However, current genotyping technologies are one to two orders of magnitude too expensive to permit the analysis of thousands of SNPs in large samples. We have developed and thoroughly validated a highly accurate protocol for SNP allele frequency estimation in DNA pools based upon the SNaPshot (Applied Biosystems) chemistry adaptation of primer extension. Using this assay, we were able to estimate the difference in allele frequencies between pooled cases and controls (Delta) with a mean error of 0.01. Moreover, when we genotyped seven different SNPs in a single multiplex reaction, the results were similar, with a mean error for Delta of 0.008. The assay performed well for low-frequency alleles (f ≈ 0.05) and was accurate even with relatively poor quality DNA template extracted from mouthwashes. Our assay conditions are generalisable, universal, robust and, therefore, for the first time, permit high-throughput association analysis at a realistic cost.
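As a rough illustration of the quantity called Delta in the abstract above, the sketch below estimates an allele frequency in each pool from two allele signals and subtracts the control estimate from the case estimate. The abstract does not give the formula used; the optional factor `k` here is the heterozygote-based correction for unequal allele signal that is common in the pooling literature (the A/B signal ratio observed in a known heterozygote), and it is an assumption, as are all function names and signal values.

```python
def pooled_allele_freq(signal_a: float, signal_b: float, k: float = 1.0) -> float:
    """Estimate the frequency of allele A in a pool from two allele signals.

    k is an assumed correction factor: the A/B signal ratio seen in a
    heterozygote, where the true ratio is 1:1. k = 1.0 means no correction.
    """
    return signal_a / (signal_a + k * signal_b)


def delta(case_signals, control_signals, k: float = 1.0) -> float:
    """Difference in estimated allele A frequency, cases minus controls."""
    return (pooled_allele_freq(*case_signals, k=k)
            - pooled_allele_freq(*control_signals, k=k))


# Hypothetical peak heights, for illustration only.
k_het = 1.2 / 1.0  # a heterozygote gave A = 1.2, B = 1.0
print(pooled_allele_freq(1.2, 1.0, k=k_het))        # heterozygote maps back to 0.5
print(delta((30.0, 70.0), (25.0, 75.0), k=k_het))   # corrected case/control difference
```

With the correction applied, a heterozygote's signals map back to a frequency of 0.5, which is the property the factor is meant to restore.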
6.
To evaluate the ability to use DNA pools with the Illumina Infinium genotyping platform, two sets of gradient pools were created using two pairs of highly inbred chicken lines. Replicate pools containing 0%, 10%, 20%, 40%, 60%, 80%, 90% and 100% of DNA from line A vs. B or line C vs. D were created, for a total of 28 pools. All pools were genotyped for 12 046 SNPs. Three frequency estimation methods proposed in the literature (standard, heterozygote‐corrected and normalized) were compared with three alternate methods proposed herein based on mean square error (MSE), bias and variance of estimated vs. true allele frequencies and the fit of regression of estimated on true frequencies. The three new methods had average square root MSE of 4.6%, 4.6% and 4.7% compared to 5.2%, 5.5% and 11.2% for the three literature methods. Average absolute biases of the literature methods were 2.4%, 2.7% and 8.2% compared to 2.4% for all new methods. Standard deviations of estimates were also smaller for the new methods, at 3.1%, 3.2% and 3.2% compared to 3.5%, 4.0% and 5.0% for previously reported methods. In conclusion, intensity data from the Illumina Infinium Assay can be efficiently used to estimate allele frequencies in pools, in particular using any of the new methods proposed herein.
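The comparison criteria named in the abstract above (MSE, bias and variance of estimated vs. true allele frequencies) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the true frequencies are the gradient proportions listed in the abstract, the estimated values are invented, and the function name `error_summary` is hypothetical.

```python
from statistics import fmean, pvariance


def error_summary(estimated, true):
    """Summarise estimation error for paired estimated/true allele frequencies."""
    errors = [e - t for e, t in zip(estimated, true)]
    bias = fmean(errors)                       # mean signed error
    variance = pvariance(errors)               # population variance of the errors
    mse = fmean([err * err for err in errors]) # mean squared error
    return {"bias": bias, "variance": variance, "mse": mse, "rmse": mse ** 0.5}


# True frequencies follow the gradient pools in the abstract; estimates are made up.
true_freq = [0.0, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 1.0]
est_freq = [0.03, 0.12, 0.24, 0.41, 0.57, 0.78, 0.86, 0.97]

summary = error_summary(est_freq, true_freq)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Because MSE equals squared bias plus variance, a method can score well on one component and still have a large root MSE, which is why the abstract reports the three measures separately.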
7.
PPC: an algorithm for accurate estimation of SNP allele frequencies in small equimolar pools of DNA using data from high density microarrays (total citations: 1; self-citations: 1; citations by others: 1)
Robust estimation of allele frequencies in pools of DNA has the potential to reduce genotyping costs and/or increase the number of individuals contributing to a study where hundreds of thousands of genetic markers need to be genotyped in very large population sample sets, such as genome-wide association studies. In order to make accurate allele frequency estimations from pooled samples, a correction for unequal allele representation must be applied. We have developed the polynomial based probe specific correction (PPC), which is a novel correction algorithm for accurate estimation of allele frequencies in data from high-density microarrays. This algorithm was validated through comparison of allele frequencies from a set of 10 individually genotyped DNAs and frequencies estimated from pools of these 10 DNAs using GeneChip 10K Mapping Xba 131 arrays. Our results demonstrate that when using the PPC to correct for allelic biases, the accuracy of the allele frequency estimates increases dramatically.
8.
Nanoparticle-based detection and quantification of DNA with single nucleotide polymorphism (SNP) discrimination selectivity (total citations: 1; self-citations: 0; citations by others: 1)
Sequence-specific DNA detection is important in various biomedical applications such as gene expression profiling, disease diagnosis and treatment, drug discovery and forensic analysis. Here we report a gold nanoparticle-based method that allows DNA detection and quantification and is capable of single nucleotide polymorphism (SNP) discrimination. The precise quantification of single-stranded DNA is due to the formation of defined nanoparticle-DNA conjugate groupings in the presence of target/linker DNA. Conjugate groupings were characterized and quantified by gel electrophoresis. A linear correlation between the amount of target DNA and conjugate groupings was found. For SNP detection, single base mismatch discrimination was achieved for both the end- and center-base mismatch. The method described here may be useful for the development of a simple and quantitative DNA detection assay.
9.
A facile, rapid, stable and sensitive approach for fluorescent detection of single nucleotide polymorphisms (SNPs) is designed based on the DNA ligase reaction and π-stacking between graphene and the nucleotide bases. In the presence of perfectly matched DNA, DNA ligase can catalyze the linkage of fluorescein amidite-labeled single-stranded DNA (ssDNA) and a phosphorylated ssDNA, and thus the formation of a stable duplex in high yield. However, the catalytic reaction cannot proceed effectively with a one-base mismatched DNA target. In this case, we add graphene to the system in order to produce different quenching signals due to its different adsorption affinity for ssDNA and double-stranded DNA. Taking advantage of the unique surface property of graphene and the high discriminability of DNA ligase, the proposed protocol exhibits good performance in SNP genotyping. The results indicate that it is possible to accurately determine SNPs with a frequency as low as 2.6% within 40 min. Furthermore, the presented flexible strategy facilitates the development of other biosensing applications in the future.
10.
Assessing allele frequencies of single nucleotide polymorphisms in DNA pools by pyrosequencing technology (total citations: 9; self-citations: 0; citations by others: 9)
Single nucleotide polymorphism (SNP) association studies searching for differences in allele frequencies between cases and controls have been widely used for genetic analysis. Individual genotyping is prohibitively expensive in large sample sizes. Pooling of samples provides the obvious advantage of higher throughput and lower cost. Here we report our results with the analysis of SNP allele frequencies in DNA pools using Pyrosequencing technology. For seven different SNPs, we observed a mean difference of 1.1 ± 0.6% between allele frequencies determined in two different DNA pools (n = 150 cases and 150 controls) compared to individually genotyped samples.
11.
SNP discovery and allele frequency estimation by deep sequencing of reduced representation libraries (total citations: 2; self-citations: 0; citations by others: 2)
Van Tassell CP Smith TP Matukumalli LK Taylor JF Schnabel RD Lawley CT Haudenschild CD Moore SS Warren WC Sonstegard TS 《Nature methods》2008,5(3):247-252
High-density single-nucleotide polymorphism (SNP) arrays have revolutionized the ability of genome-wide association studies to detect genomic regions harboring sequence variants that affect complex traits. Extensive numbers of validated SNPs with known allele frequencies are essential to construct genotyping assays with broad utility. We describe an economical, efficient, single-step method for SNP discovery, validation and characterization that uses deep sequencing of reduced representation libraries (RRLs) from specified target populations. Using nearly 50 million sequences generated on an Illumina Genome Analyzer from DNA of 66 cattle representing three populations, we identified 62,042 putative SNPs and predicted their allele frequencies. Genotype data for these 66 individuals validated 92% of 23,357 selected genome-wide SNPs, with a genotypic and sequence allele frequency correlation of r = 0.67. This approach for simultaneous de novo discovery of high-quality SNPs and population characterization of allele frequencies may be applied to any species with at least a partially sequenced genome.
12.
PCR-SSCP patterns on non-degenerative PAGE revealed six amplicons in caprine GH exon-4, and three alleles (A4, B4 and C4) were identified. In exon-5, six SSCP variants revealed three alleles: A5, B5 and C5. Of the 54 amino acid (AA) sites in the GH-4 coding region, six codons were polymorphic. At codon-6, a G/A nucleotide substitution resulted in genotypes 6RR and 6HH, and a G/C substitution in genotypes 6PP and 6RP. At codon-36, an A/G nucleotide substitution resulted in a new genotype, 36GG, in place of 36DD in the GenBank reference sample. At codon-54, a C/T nucleotide substitution changed the amino acid from arginine (R) to tryptophan (W), giving a new genotype, 54WW, in comparison with 54RR in the GenBank reference sample. In exon-5, 8 of the 67 AA sites were polymorphic, but codons 14 and 60 were preponderant. At codon-14, an A/G substitution resulted in three genotypes, 14KK, 14EE and 14KE, with frequencies of 0.52, 0.38 and 0.10, respectively. At codon-60, G/C and G/A substitutions resulted in three genotypes, 60GG, 60RR and 60GR, with frequencies of 0.48, 0.42 and 0.10, respectively. Synonymous mutations relative to GenBank accession D00476.1 were present at codons 25, 31 and 62 in all the Jakhrana goats. The high genetic variability in GH gene exon-4 and exon-5 may be useful in exploring their associations with milk and growth traits in goats for further genetic improvement.
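The genotype frequencies reported in the abstract above can be converted to allele frequencies by simple counting: each homozygote contributes its full frequency to one allele and each heterozygote contributes half to each. The sketch below applies this to the codon-14 and codon-60 figures; the function name is hypothetical, and the abstract itself does not report allele frequencies for these codons.

```python
def allele_freqs(hom_first: float, hom_second: float, het: float):
    """Return (freq_first, freq_second) from three genotype frequencies."""
    total = hom_first + hom_second + het
    first = (hom_first + het / 2.0) / total
    return first, 1.0 - first


# Codon 14: genotypes 14KK, 14EE, 14KE at 0.52, 0.38, 0.10 (from the abstract).
k14, e14 = allele_freqs(0.52, 0.38, 0.10)
# Codon 60: genotypes 60GG, 60RR, 60GR at 0.48, 0.42, 0.10 (from the abstract).
g60, r60 = allele_freqs(0.48, 0.42, 0.10)

print(f"codon 14: K = {k14:.2f}, E = {e14:.2f}")
print(f"codon 60: G = {g60:.2f}, R = {r60:.2f}")
```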
13.
Single nucleotide polymorphisms (SNPs) are single-base heritable variations at a defined genetic location that occur in at least 1% of the population. SNPs are useful markers for genetic association studies of disease susceptibility or adverse drug reactions, in evolutionary studies and in forensic science. Given the potential impact of SNPs, the biotechnology industry has focused on the development of high-throughput methods for SNP genotyping. Many high-throughput SNP genotyping technologies are currently available, and many others have recently been patented. Each offers a unique combination of scale, accuracy, throughput and cost. In this review, we describe some of the most important recent SNP genotyping methods and the recent patents associated with them.
14.
CJ Huijsmans J Poodt J Damen JC van der Linden PH Savelkoul JF Pruijt M Hilbink MH Hermans 《PloS one》2012,7(7):e38362
During tumor development, loss of heterozygosity (LOH) often occurs. When LOH is preceded by an oncogene-activating mutation, the mutant allele may be further potentiated if the wild-type allele is lost or inactivated. In myeloproliferative neoplasms (MPN), somatic acquisition of JAK2V617F may be followed by LOH resulting in loss of the wild-type allele. The occurrence of LOH in MPN and other proliferative diseases may further potentiate the mutant allele and thereby increase morbidity. A real-time PCR-based SNP profiling assay was developed and validated for LOH detection in the JAK2 region (JAK2LOH). Blood from a cohort of 12 JAK2V617F-positive patients (n=6 with 25-50% and n=6 with >50% JAK2V617F) and a cohort of 81 patients suspected of MPN was stored with EDTA and subsequently used for validation. To generate germ-line profiles, non-neoplastic formalin-fixed paraffin-embedded tissue from each patient was analyzed. Results of the SNP assay were compared to those of an established Short Tandem Repeat (STR) assay. Both assays revealed JAK2LOH in 1/6 patients with 25-50% JAK2V617F. In patients with >50% JAK2V617F, JAK2LOH was detected in 6/6 patients by the SNP assay and in 5/6 patients by the STR assay. Of the 81 patients suspected of MPN, 18 carried JAK2V617F. Both the SNP and STR assays demonstrated the occurrence of JAK2LOH in 5 of them. In the 63 JAK2V617F-negative patients, no JAK2LOH was observed by SNP or STR analysis. The presented SNP assay reliably detects JAK2LOH and is a fast and easy-to-perform alternative to STR analysis. We therefore regard the SNP approach as a proof of principle for the development of LOH SNP assays for other clinically relevant LOH loci.
15.
We present a statistical framework for estimation and application of sample allele frequency spectra from New-Generation Sequencing (NGS) data. In this method, we first estimate the allele frequency spectrum using maximum likelihood. In contrast to previous methods, the likelihood function is calculated using a dynamic programming algorithm and numerically optimized using analytical derivatives. We then use a Bayesian method for estimating the sample allele frequency in a single site, and show how the method can be used for genotype calling and SNP calling. We also show how the method can be extended to various other cases including cases with deviations from Hardy-Weinberg equilibrium. We evaluate the statistical properties of the methods using simulations and by application to a real data set.
16.
Highly cost-efficient genome-wide association studies using DNA pools and dense SNP arrays (total citations: 3; self-citations: 0; citations by others: 3)
Macgregor S Zhao ZZ Henders A Nicholas MG Montgomery GW Visscher PM 《Nucleic acids research》2008,36(6):e35
Genome-wide association (GWA) studies to map genes for complex traits are powerful yet costly. DNA-pooling strategies have the potential to dramatically reduce the cost of GWA studies. Pooling using Affymetrix arrays has been proposed and used but the efficiency of these arrays has not been quantified. We compared and contrasted Affymetrix Genechip HindIII and Illumina HumanHap300 arrays on the same DNA pools and showed that the HumanHap300 arrays are substantially more efficient. In terms of effective sample size, HumanHap300-based pooling extracts >80% of the information available with individual genotyping (IG). In contrast, Genechip HindIII-based pooling only extracts ~30% of the available information. With HumanHap300 arrays concordance with IG data is excellent. Guidance is given on best study design and it is shown that even after taking into account pooling error, one stage scans can be performed for >100-fold reduced cost compared with IG. With appropriately designed two stage studies, IG can provide confirmation of pooling results whilst still providing ~20-fold reduction in total cost compared with IG-based alternatives. The large cost savings with Illumina HumanHap300-based pooling imply that future studies need only be limited by the availability of samples and not cost.
17.
Tsui C Coleman LE Griffith JL Bennett EA Goodson SG Scott JD Pittard WS Devine SE 《Nucleic acids research》2003,31(16):4910-4916
An international effort is underway to generate a comprehensive haplotype map (HapMap) of the human genome represented by an estimated 300 000 to 1 million ‘tag’ single nucleotide polymorphisms (SNPs). Our analysis indicates that the current human SNP map is not sufficiently dense to support the HapMap project. For example, 24.6% of the genome currently lacks SNPs at the minimal density and spacing that would be required to construct even a conservative tag SNP map containing 300 000 SNPs. In an effort to improve the human SNP map, we identified 140 696 additional SNP candidates using a new bioinformatics pipeline. Over 51 000 of these SNPs mapped to the largest gaps in the human SNP map, leading to significant improvements in these regions. Our SNPs will be immediately useful for the HapMap project, and will allow for the inclusion of many additional genomic intervals in the final HapMap. Nevertheless, our results also indicate that additional SNP discovery projects will be required both to define the haplotype architecture of the human genome and to construct comprehensive tag SNP maps that will be useful for genetic linkage studies in humans.
18.
Single nucleotide polymorphisms (SNPs) are thought to be well suited to genetic and evolutionary studies. In this study, we report the first set of SNP markers in a commercially important crab species, Scylla paramamosain. A total of 12,500 base pairs of high-quality DNA sequence were obtained from 15 genes, and thirty-seven SNPs were identified, representing one SNP every 338 base pairs. Twenty-four SNPs were successfully genotyped in a single population. All loci had two alleles, and the minor allele frequency ranged from 0.02 to 0.44. The observed and expected heterozygosity ranged from 0.04 to 0.59 and from 0.04 to 0.50, respectively. No significant departures from Hardy–Weinberg equilibrium were found at any locus. Linkage disequilibrium was detected in six pairs of loci, but none remained significant after sequential Bonferroni correction. These SNP markers will provide a useful addition to the tools for genetic and evolutionary studies of S. paramamosain.
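Two of the figures in the abstract above lend themselves to a quick arithmetic check. The sketch below recomputes the SNP density from the reported sequence length and SNP count, and evaluates the textbook Hardy-Weinberg expected heterozygosity 2p(1-p) for a biallelic locus at the reported minor allele frequency bounds. The abstract does not say which heterozygosity estimator the authors used, so the 2p(1-p) values are only an approximation and need not reproduce the reported range exactly; the function names are hypothetical.

```python
def snp_density(total_bp: int, n_snps: int) -> float:
    """Average number of base pairs per SNP."""
    return total_bp / n_snps


def expected_heterozygosity(minor_allele_freq: float) -> float:
    """Hardy-Weinberg expected heterozygosity, 2p(1-p), for a biallelic locus."""
    return 2.0 * minor_allele_freq * (1.0 - minor_allele_freq)


# 12,500 bp and 37 SNPs are the figures given in the abstract.
print(f"one SNP every {snp_density(12_500, 37):.0f} bp")

# Minor allele frequency bounds reported in the abstract.
for maf in (0.02, 0.44):
    print(f"MAF {maf:.2f}: expected heterozygosity {expected_heterozygosity(maf):.4f}")
```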
19.
Kirkpatrick B Armendariz CS Karp RM Halperin E 《Bioinformatics (Oxford, England)》2007,23(22):3048-3055
MOTIVATION: The search for genetic variants that are linked to complex diseases such as cancer, Parkinson's, or Alzheimer's disease may lead to better treatments. Since haplotypes can serve as proxies for hidden variants, one method of finding the linked variants is to look for case-control associations between the haplotypes and disease. Finding these associations requires a high-quality estimation of the haplotype frequencies in the population. To this end, we present HaploPool, a method of estimating haplotype frequencies from blocks of consecutive SNPs. RESULTS: HaploPool leverages the efficiency of DNA pools and estimates the population haplotype frequencies from pools of disjoint sets, each containing two or three unrelated individuals. We study the trade-off between pooling efficiency and accuracy of haplotype frequency estimates. For a fixed genotyping budget, HaploPool performs favorably on pools of two individuals as compared with a state-of-the-art non-pooled phasing method, PHASE. Of independent interest, HaploPool can be used to phase non-pooled genotype data with an accuracy approaching that of PHASE. We compared our algorithm to three programs that estimate haplotype frequencies from pooled data. HaploPool is an order of magnitude more efficient (at least six times faster), and considerably more accurate than previous methods. In contrast to previous methods, HaploPool performs well with missing data, genotyping errors and long haplotype blocks (of between 5 and 25 SNPs).
20.
Li-Sucholeiki XC Tomita-Mitchell A Arnold K Glassner BJ Thompson T Murthy JV Berk L Lange C Leong-Morgenthaler PM MacDougall D Munro J Cannon D Mistry T Miller A Deka C Karger B Gillespie KM Ekstrøm PO Todd JA Thilly WG 《Mutation research》2005,570(2):267-280
DNA variants underlying the inheritance of risk for common diseases are expected to have a wide range of population allele frequencies. The detection and scoring of the rare alleles (at frequencies of <0.01) presents significant practical problems, including the requirement for large sample sizes and the limitations inherent in current methodologies for allele discrimination. In the present report, we have applied mutational spectrometry based on constant denaturing capillary electrophoresis (CDCE) to DNA pools from large populations in order to improve the prospects of testing the role of rare variants in common diseases on a large scale. We conducted a pilot study of the cytotoxic T lymphocyte-associated antigen-4 gene (CTLA4) in type 1 diabetes (T1D). A total of 1228 bp, comprising 98% of the CTLA4 coding sequence, all adjacent intronic mRNA splice sites, and a 3′ UTR sequence were scanned for unknown point mutations in pools of genomic DNA from a control population of 10,464 young American adults and two T1D populations, one American (1799 individuals) and one from the United Kingdom (2102 individuals). The data suggest that it is unlikely that rare variants in the scanned regions of CTLA4 represent a significant proportion of T1D risk and illustrate that CDCE-based mutational spectrometry of DNA pools offers a feasible and cost-effective means of testing the role of rare variants in susceptibility to common diseases.