Similar articles
Found 20 similar articles (search time: 31 ms)
1.
Using striped bass (Morone saxatilis) and six multiplexed microsatellite markers, we evaluated procedures for estimating allele frequencies by pooling DNA from multiple individuals, a method suggested as cost-effective relative to individual genotyping. Using moment-based estimators, we estimated allele frequencies in experimental DNA pools and found that the three primary laboratory steps, DNA quantitation and pooling, PCR amplification, and electrophoresis, accounted for 23, 48, and 29%, respectively, of the technical variance of estimates in pools containing DNA from 2-24 individuals. Exact allele-frequency estimates could be made for pools of sizes 2-8, depending on the locus, by using an integer-valued estimator. Larger pools of sizes 12 and 24 tended to yield biased estimates; however, replicates of these estimates detected allele frequency differences among pools with different allelic compositions. We also derive an unbiased estimator of Hardy-Weinberg disequilibrium coefficients that uses multiple DNA pools and analyze the cost-efficiency of DNA pooling. DNA pooling yields the most potential cost savings when a large number of loci are employed using a large number of individuals, a situation becoming increasingly common as microsatellite loci are developed in increasing numbers of taxa.

2.
Chi XF  Lou XY  Yang MC  Shu QY 《Genetica》2009,135(3):267-281
We present a cost-effective DNA pooling strategy for fine mapping of a single Mendelian gene in controlled crosses. The theoretical argument suggests that it is potentially possible for a single-stage pooling approach to reduce the overall experimental expense considerably by balancing costs for genotyping and sample collection. Further, the genotyping burden can be reduced through multi-stage pooling. Numerical results are provided for practical guidelines. For example, the genotyping effort can be reduced to only a small fraction of that needed for individual genotyping at a small loss of estimation accuracy or at a cost of increasing sample sizes slightly when recombination rates are 0.5% or less. An optimal two-stage pooling scheme can reduce the amount of genotyping to 19.5%, 14.5% and 6.4% of individual genotyping efforts for identifying a gene within 1, 0.5, and 0.1 cM, respectively. Finally, we use a genetic data set for mapping the rice xl(t) gene to demonstrate the feasibility and efficiency of the DNA pooling strategy. Taken together, the results demonstrate that this DNA pooling strategy can greatly reduce the genotyping burden and the overall cost in fine mapping experiments.

3.
We describe a microarray experiment using the MCF-7 breast cancer cell line in two different experimental conditions for which the same number of independent pools as the number of individual samples was hybridized on Affymetrix GeneChips. Unexpectedly, when using individual samples, the number of probe sets found to be differentially expressed between treated and untreated cells was about three times greater than that found using pools. These findings indicate that pooling samples in microarray experiments where the biological variability is expected to be small might not be helpful and could even decrease one's ability to identify differentially expressed genes.

4.
Deconvolution of relationships between bacterial artificial chromosome (BAC) clones and genes is a crucial step in the selective sequencing of regions of interest in a genome. It often includes combinatorial pooling of unique probes obtained from the genes (unigenes), and screening of the BAC library using the pools in a hybridization experiment. Since several probes can hybridize to the same BAC, in order for the deconvolution to be achievable the pooling design has to be able to handle a large number of positives. As a consequence, smaller pools need to be designed, which in turn increases the number of hybridization experiments, possibly making the entire protocol unfeasible. We propose a new algorithm that is capable of producing high-accuracy deconvolution even in the presence of a weak pooling design, i.e. when pools are rather large. The algorithm compensates for the decrease of information in the hybridization data by taking advantage of a physical map of the BAC clones. We show that the right combination of combinatorial pooling and our algorithm not only dramatically reduces the number of pools required, but also successfully deconvolutes the BAC-gene relationships with almost perfect accuracy. Software is available on request from the first author.

5.
Zhao Y  Wang S 《Human heredity》2009,67(1):46-56
Study cost remains the major limiting factor for genome-wide association studies due to the necessity of genotyping a large number of SNPs for a large number of subjects. Both DNA pooling strategies and two-stage designs have been proposed to reduce genotyping costs. In this study, we propose a cost-effective, two-stage approach with a DNA pooling strategy. During stage I, all markers are evaluated on a subset of individuals using DNA pooling. The most promising set of markers is then evaluated with individual genotyping for all individuals during stage II. The goal is to determine the optimal parameters ($\pi_p^{\mathrm{sample}}$, the proportion of samples used during stage I with DNA pooling; and $\pi_p^{\mathrm{marker}}$, the proportion of markers evaluated during stage II with individual genotyping) that minimize the cost of a two-stage DNA pooling design while maintaining a desired overall significance level and achieving a level of power similar to that of a one-stage individual genotyping design. We considered the effects of three factors on optimal two-stage DNA pooling designs. Our results suggest that, under most scenarios considered, the optimal two-stage DNA pooling design may be much more cost-effective than the optimal two-stage individual genotyping design, which uses individual genotyping during both stages.

6.
The success of genome-wide association studies (GWAS) to identify risk loci of complex diseases is now well-established. One persistent major hurdle is the cost of those studies, which puts them beyond the reach of most research groups. Performing GWAS on pools of DNA samples may be an effective strategy to reduce the costs of these studies. In this study, we performed pooling-based GWAS with more than 550,000 SNPs in two case-control cohorts consisting of patients with Type II diabetes (T2DM) and with chronic rhinosinusitis (CRS). In the T2DM study, the results of the pooling experiment were compared to individual genotypes obtained from a previously published GWAS. TCF7L2 and HHEX SNPs associated with T2DM by the traditional GWAS were among the top ranked SNPs in the pooling experiment. This dataset was also used to refine the best strategy to correctly identify SNPs that will remain significant based on individual genotyping. In the CRS study, the top hits from the pooling-based GWAS located within ten kilobases of known genes were validated by individual genotyping of 1,536 SNPs. Forty-one percent (598 out of the 1,457 SNPs that passed quality control) were associated with CRS at a nominal P value of 0.05, confirming the potential of pooling-based GWAS to identify SNPs that differ in allele frequencies between two groups of subjects. Overall, our results demonstrate that a pooling experiment on high-density genotyping arrays can accurately determine the minor allele frequency as compared to individual genotyping and produce a list of top ranked SNPs that captures genuine allelic differences between a group of cases and controls. The low cost associated with a pooling-based GWAS clearly justifies its use in screening for genetic determinants of complex diseases.

7.
To reduce the costs of using the ELITEST-MVV, we explored the possibilities of sample pooling. Straightforward pooling applying the manufacturer's test conditions resulted in a significant loss of sensitivity. This was solved by using lower pre-dilutions of the samples than prescribed. Although an increase of background signal was encountered, discrimination between positive and negative samples was even better at pre-dilutions up to 12.5× as compared to the standard pre-dilution of 100×. This implied that pooling of up to eight samples was feasible. Receiver operating characteristic (ROC) analysis was used to determine the optimal cut-off value for the testing of pooled serum samples.

A model for cost-benefit analysis of pooling was applied which combines the economics of the technical performance of the modified assay and other additional cost factors connected with pooling such as hands-on time for composing the pools, expected seroprevalence in the test population, sample tracing and testing the individual samples of positive pools.

We concluded that pooling of samples was only feasible for monitoring SRLV-free accredited flocks because of their very low prevalence of infection. A pool consisting of five samples turned out to be the economic optimum although technically pool sizes of 10 samples were permitted.
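
The trade-off between pool size and prevalence described above can be illustrated with the classical two-stage (Dorfman) pooling model. This is only a minimal sketch: it assumes a perfect assay and counts tests alone, whereas the authors' cost-benefit model also includes hands-on time, sample tracing and assay performance. The function names are hypothetical.

```python
def expected_tests_per_sample(prevalence: float, pool_size: int) -> float:
    """Expected number of tests per sample under two-stage (Dorfman) pooling.

    Assumes a perfect test: every pool is tested once, and each member of a
    positive pool is then retested individually.
    """
    if pool_size == 1:
        return 1.0  # no pooling: one test per sample
    # Probability that a pool contains at least one positive sample.
    p_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    # One pool test shared by pool_size samples, plus one retest per sample
    # whenever the pool is positive.
    return 1.0 / pool_size + p_pool_positive


def best_pool_size(prevalence: float, max_pool_size: int) -> int:
    """Pool size in 1..max_pool_size that minimises expected tests per sample."""
    return min(
        range(1, max_pool_size + 1),
        key=lambda k: expected_tests_per_sample(prevalence, k),
    )


if __name__ == "__main__":
    for prevalence in (0.001, 0.01, 0.1, 0.5):
        k = best_pool_size(prevalence, 30)
        cost = expected_tests_per_sample(prevalence, k)
        print(f"prevalence={prevalence}: pool size {k}, {cost:.3f} tests/sample")
```

Under these assumptions pooling pays off only at low prevalence, which is consistent with the conclusion that it suits accredited, essentially infection-free flocks. The economically optimal pool size reported in the abstract (five) reflects cost factors this sketch leaves out.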


8.
Molecular markers produced by next‐generation sequencing (NGS) technologies are revolutionizing genetic research. However, the costs of analysing large numbers of individual genomes remain prohibitive for most population genetics studies. Here, we present results based on mathematical derivations showing that, under many realistic experimental designs, NGS of DNA pools from diploid individuals allows allele frequencies at single nucleotide polymorphisms (SNPs) to be estimated with at least the same accuracy as individual‐based analyses, for considerably lower library construction and sequencing efforts. These findings remain true when taking into account the possibility of substantially unequal contributions of each individual to the final pool of sequence reads. We propose the intuitive notion of effective pool size to account for unequal pooling and derive a Bayesian hierarchical model to estimate this parameter directly from the data. We provide a user‐friendly application assessing the accuracy of allele frequency estimation from both pool‐ and individual‐based NGS population data under various sampling, sequencing depth and experimental error designs. We illustrate our findings with theoretical examples and real data sets corresponding to SNP loci obtained using restriction site–associated DNA (RAD) sequencing in pool‐ and individual‐based experiments carried out on the same population of the pine processionary moth (Thaumetopoea pityocampa). NGS of DNA pools might not be optimal for all types of studies but provides a cost‐effective approach for estimating allele frequencies for very large numbers of SNPs. It thus allows comparison of genome‐wide patterns of genetic variation for large numbers of individuals in multiple populations.

9.
Brookmeyer R 《Biometrics》1999,55(2):608-612
The testing of pooled samples of biological specimens for the purpose of estimating disease prevalence may be more cost effective than testing individual samples, particularly if the prevalence of disease is low. Multistage pooling studies involve testing pools and then sequentially subdividing and testing the positive pools. A simple estimator of disease prevalence and its variance are derived for general multistage pooling studies and are shown to be natural generalizations of Thompson's (1962) original estimators for single-stage pooling studies. The reduction in variance associated with each additional stage is calibrated. The results are extended to estimating disease incidence rates. The methods are used to estimate HIV incidence rates from a prevalence study of early HIV infection using a PCR assay for HIV RNA.
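
For orientation, the single-stage estimator that the abstract builds on can be sketched as follows. With n pools of k specimens each, of which x test positive, the estimate is 1 - (1 - x/n)^(1/k). The sketch assumes equal pool sizes and a perfect assay, and it does not implement the multistage generalization or the variance formula from the paper. The function name is hypothetical.

```python
def pooled_prevalence(positive_pools: int, n_pools: int, pool_size: int) -> float:
    """Single-stage pooled-testing estimate of prevalence.

    A pool is negative only if all of its pool_size specimens are negative,
    so P(pool negative) = (1 - p) ** pool_size. Solving for p with the
    observed fraction of negative pools gives the estimate.
    """
    if not 0 <= positive_pools <= n_pools:
        raise ValueError("positive_pools must lie between 0 and n_pools")
    if n_pools <= 0 or pool_size <= 0:
        raise ValueError("n_pools and pool_size must be positive")
    fraction_negative = 1.0 - positive_pools / n_pools
    return 1.0 - fraction_negative ** (1.0 / pool_size)


if __name__ == "__main__":
    # 19 positive pools out of 100 pools of 2 specimens each.
    print(pooled_prevalence(19, 100, 2))
```

When every pool is positive the estimate is 1, which says little about the true prevalence; this is one reason pooling works best when disease is rare.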

10.
Sequencing pools of individuals rather than individuals separately reduces the costs of estimating allele frequencies at many loci in many populations. Theoretical and empirical studies show that sequencing pools comprising a limited number of individuals (typically fewer than 50) provides reliable allele frequency estimates, provided that the DNA pooling and DNA sequencing steps are carefully controlled. Unequal contributions of different individuals to the DNA pool and the mean and variance in sequencing depth can both affect the standard error of allele frequency estimates. To our knowledge, no study has investigated the effects of these two factors separately, so there is currently no method to estimate a priori the relative importance of unequal individual DNA contributions independently of sequencing depth. We develop a new analytical model for allele frequency estimation that explicitly distinguishes these two effects. Our model shows that the DNA pooling variance in a pooled sequencing experiment depends solely on two factors: the number of individuals within the pool and the coefficient of variation of individual DNA contributions to the pool. We present a new method to experimentally estimate this coefficient of variation when planning a pooled sequencing design where samples are either pooled before or after DNA extraction. Using this analytical and experimental framework, we provide guidelines to optimize the design of pooled sequencing experiments. Finally, we sequence replicated pools of inbred lines of the plant Medicago truncatula and show that the predictions from our model generally hold true when estimating the frequency of known multilocus haplotypes using pooled sequencing.
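
The role of the coefficient of variation (CV) can be illustrated with a simple weighted-average argument. It is offered as a sketch under stated assumptions and is not the authors' analytical model. If individual i contributes a fraction w_i of the pooled DNA, the sum of squared weights equals (1 + CV^2) / n, so unequal contributions shrink a pool of n individuals to an effective size of n / (1 + CV^2). The sketch further assumes diploid individuals sampled from a Hardy-Weinberg population and ignores sequencing depth. All names are hypothetical.

```python
from statistics import fmean, pvariance


def effective_pool_size(contributions: list[float]) -> float:
    """Effective number of individuals in a pool with unequal DNA contributions.

    Uses the identity sum(w_i ** 2) = (1 + CV ** 2) / n for normalised weights
    w_i, where CV is the coefficient of variation of the raw contributions
    (population variance, not sample variance).
    """
    n = len(contributions)
    mean = fmean(contributions)
    cv_squared = pvariance(contributions) / mean ** 2
    return n / (1.0 + cv_squared)


def pooling_variance(allele_freq: float, contributions: list[float],
                     ploidy: int = 2) -> float:
    """Sampling variance of the pooled allele frequency, ignoring read depth.

    Assumes independent gene copies within individuals (Hardy-Weinberg), so
    each individual adds allele_freq * (1 - allele_freq) / ploidy of variance,
    scaled by its squared weight.
    """
    n_eff = effective_pool_size(contributions)
    return allele_freq * (1.0 - allele_freq) / (ploidy * n_eff)


if __name__ == "__main__":
    equal = [1.0] * 20
    unequal = [0.2] * 10 + [1.8] * 10  # same mean, CV = 0.8
    print(effective_pool_size(equal), effective_pool_size(unequal))
```

Apart from the usual p(1 - p) factor, this variance depends only on the pool size and the CV of contributions, which matches the qualitative statement in the abstract.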

11.
The study of gene functions requires high-quality DNA libraries. However, a large number of tests and screenings are necessary for compiling such libraries. We describe an algorithm for extracting as much information as possible from pooling experiments for library screening. Collections of clones are called pools, and a pooling experiment is a group test for detecting all positive clones. The probability of positiveness for each clone is estimated according to the outcomes of the pooling experiments. Clones with a high chance of positiveness are subjected to confirmatory testing. In this paper, we introduce a new positive clone detecting algorithm, called the Bayesian network pool result decoder (BNPD). The performance of BNPD is compared, by simulation, with that of the Markov chain pool result decoder (MCPD) proposed by Knill et al. in 1996. Moreover, the combinatorial properties of pooling designs suitable for the proposed algorithm are discussed in conjunction with combinatorial designs and d-disjunct matrices. We also show the advantage of utilizing packing designs or BIB designs for the BNPD algorithm.
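
As a point of reference for what a pool result decoder has to do, the following is a minimal sketch of the simplest combinatorial decoder for error-free group tests. It is a baseline only and is neither BNPD nor MCPD, both of which assign probabilities and tolerate experimental error. The data layout (a list of sets of clone identifiers) is an assumption made for illustration.

```python
def decode_pools(pools: list[set[int]], outcomes: list[bool]) -> tuple[set[int], set[int]]:
    """Naive decoder for error-free pooling experiments.

    pools[i] is the set of clone ids in pool i; outcomes[i] is True if pool i
    tested positive. Returns (candidates, definite):
      candidates: clones that never appear in a negative pool
      definite:   candidates that are the sole remaining explanation for at
                  least one positive pool
    """
    all_clones = set().union(*pools)

    # Any clone that appears in a negative pool must itself be negative.
    cleared: set[int] = set()
    for members, positive in zip(pools, outcomes):
        if not positive:
            cleared |= members
    candidates = all_clones - cleared

    # A positive pool with exactly one uncleared clone pins that clone down.
    definite: set[int] = set()
    for members, positive in zip(pools, outcomes):
        if positive:
            remaining = members & candidates
            if len(remaining) == 1:
                definite |= remaining
    return candidates, definite


if __name__ == "__main__":
    pools = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]
    outcomes = [True, True, False, False]  # clone 2 is the only positive
    print(decode_pools(pools, outcomes))
```

Candidates that are not definite are the clones that would go on to confirmatory testing. Well-chosen designs, such as the d-disjunct matrices mentioned above, aim to keep that set small.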

12.
Breen G  Harold D  Ralston S  Shaw D  St Clair D 《BioTechniques》2000,28(3):464-6, 468, 470
Single nucleotide polymorphisms (SNPs) are among the most common types of polymorphism used for genetic association studies. A method to allow the accurate quantitation of their allele frequencies from DNA pools would both increase throughput and decrease costs for large-scale genotyping. However, to date, most DNA pooling studies have concentrated on the use of microsatellite polymorphisms. In the case of SNPs that are restriction fragment length polymorphisms (RFLPs), studies have tended to use methods for the quantitation of allele frequency from pools that rely on densitometric evaluation of bands on an autoradiograph. Radiation-based methods have well-known drawbacks, and we present two alternative methods for the determination of SNP allele frequencies. For RFLPs, we used agarose gel electrophoresis of digested PCR products with ethidium bromide staining combined with densitometric analysis of gel images on a PC. For all types of SNP, we used allele-specific fluorescent probes in the TaqMan assay to determine the relative frequencies of two different alleles. Both methods gave accurate and reproducible results, suggesting they are suitable for use in DNA pooling experiments.

13.
Screening large populations for carriers of known or de novo rare single nucleotide polymorphisms (SNPs) is required both in Targeting Induced Local Lesions in Genomes (TILLING) experiments in plants and in screening of human populations. We previously suggested an approach that combines the mathematical field of compressed sensing with next‐generation sequencing to allow such large‐scale screening. Based on pooled measurements, this method identifies multiple carriers of heterozygous or homozygous rare alleles while using only a small fraction of resources. Its rigorous mathematical foundations allow scalable and robust detection, and provide error correction and resilience to experimental noise. Here we present a large‐scale experimental demonstration of our computational approach, in which we targeted a TILLING population of 1024 Sorghum bicolor lines to detect carriers of de novo SNPs whose frequency was less than 0.1%, using only 48 pools. Subsequent validation confirmed that all detected lines were indeed carriers of the predicted mutations. This novel approach provides a highly cost‐effective and robust tool for biologists and breeders to allow identification of novel alleles and subsequent functional analysis.

14.
Random mutagenesis and phenotype screening provide a powerful method for dissecting microbial functions, but their results can be laborious to analyze experimentally. Each mutant strain may contain 50-100 random mutations, necessitating extensive functional experiments to determine which one causes the selected phenotype. To solve this problem, we propose a "Phenotype Sequencing" approach in which genes causing the phenotype can be identified directly from sequencing of multiple independent mutants. We developed a new computational analysis method showing that (1) causal genes can be identified with high probability from even a modest number of mutant genomes; and (2) costs can be cut many-fold compared with a conventional genome sequencing approach via an optimized strategy of library-pooling (multiple strains per library) and tag-pooling (multiple tagged libraries per sequencing lane). We have performed extensive validation experiments on a set of E. coli mutants with increased isobutanol biofuel tolerance. We generated a range of sequencing experiments varying from 3 to 32 mutant strains, with pooling on 1 to 3 sequencing lanes. Our statistical analysis of these data (4099 mutations from 32 mutant genomes) successfully identified 3 genes (acrB, marC, acrA) that have been independently validated as causing this experimental phenotype. It must be emphasized that our approach reduces mutant sequencing costs enormously. Whereas a conventional genome sequencing experiment would have cost $7,200 in reagents alone, our Phenotype Sequencing design yielded the same information value for only $1,200. In fact, our smallest experiments reliably identified acrB and marC at a cost of only $110-$340.

15.
Genome-wide association (GWA) studies to map genes for complex traits are powerful yet costly. DNA-pooling strategies have the potential to dramatically reduce the cost of GWA studies. Pooling using Affymetrix arrays has been proposed and used but the efficiency of these arrays has not been quantified. We compared and contrasted Affymetrix GeneChip HindIII and Illumina HumanHap300 arrays on the same DNA pools and showed that the HumanHap300 arrays are substantially more efficient. In terms of effective sample size, HumanHap300-based pooling extracts >80% of the information available with individual genotyping (IG). In contrast, GeneChip HindIII-based pooling only extracts ~30% of the available information. With HumanHap300 arrays, concordance with IG data is excellent. Guidance is given on best study design and it is shown that even after taking into account pooling error, one-stage scans can be performed for >100-fold reduced cost compared with IG. With appropriately designed two-stage studies, IG can provide confirmation of pooling results whilst still providing ~20-fold reduction in total cost compared with IG-based alternatives. The large cost savings with Illumina HumanHap300-based pooling imply that future studies need only be limited by the availability of samples and not cost.

16.
Genome-wide genotyping of a cohort using pools rather than individual samples has long been proposed as a cost-saving alternative for performing genome-wide association (GWA) studies. However, successful disease gene mapping using pooled genotyping has thus far been limited to detecting common variants with large effect sizes, which tend not to exist for many complex common diseases or traits. Therefore, for DNA pooling to be a viable strategy for conducting GWA studies, it is important to determine whether commonly used genome-wide SNP array platforms such as the Affymetrix 6.0 array can reliably detect common variants of small effect sizes using pooled DNA. Taking obesity and age at menarche as examples of human complex traits, we assessed the feasibility of genome-wide genotyping of pooled DNA as a single-stage design for phenotype association. By individually genotyping the top associations identified by pooling, we obtained a 14- to 16-fold enrichment of SNPs nominally associated with the phenotype, but we likely missed the top true associations. In addition, we assessed whether genotyping pooled DNA can serve as an inexpensive screen as the second stage of a multi-stage design with a large number of samples by comparing the most cost-effective 3-stage designs with 80% power to detect common variants with genotypic relative risk of 1.1, with and without pooling. Given the current state of the specific technology we employed and the associated genotyping costs, we showed through simulation that a design involving pooling would be 1.07 times more expensive than a design without pooling. Thus, while a significant amount of information exists within the data from pooled DNA, our analysis does not support genotyping pooled DNA as a means to efficiently identify common variants contributing small effects to phenotypes of interest. While our conclusions were based on the specific technology and study design we employed, the approach presented here will be useful for evaluating the utility of other or future genome-wide genotyping platforms in pooled DNA studies.

17.
We have developed a versatile computer program for optimization of ligand binding experiments (e.g., radioreceptor assay system for hormones, drugs, etc.). This optimization algorithm is based on an overall measure of precision of the parameter estimates (D-optimality). The program DESIGN uses an exact mathematical model of the equilibrium ligand binding system with up to two ligands binding to any number of classes of binding sites. The program produces a minimal list of the optimal ligand concentrations for use in the binding experiment. This potentially reduces the time and cost necessary to perform a binding experiment. The program allows comparison of any proposed experimental design with the D-optimal design or with assay protocols in current use. The level of nonspecific binding is regarded as an unknown parameter of the system, along with the affinity constant (Kd) and binding capacity (Bmax). Selected parameters can be fixed at constant values and thereby excluded from the optimization algorithm. Emphasis may be placed on improving the precision of a single parameter or on improving the precision of all the parameters simultaneously. We present optimal designs for several of the more commonly used assay protocols (saturation binding with a single labeled ligand, competition or displacement curve, one or two classes of binding sites), and evaluate the robustness of these designs to changes in parameter values of the underlying models. We also derive the theoretical D-optimal design for the saturation binding experiment with a homogeneous receptor class.

18.
Genetic markers facilitate the study of inheritance and the cloning of genes by genetic approaches. Molecular markers detect differences in DNA sequence, and are thus less ambiguous than phenotypic markers, which require gene expression. We have demonstrated a molecular approach to the mapping of mutant genes using RAPD markers and pooling of individuals based on phenotype. To map genes by phenotypic pooling, a strain carrying a mutation is crossed to a strain that is homozygous for the wild-type allele of the corresponding gene. A set of primers corresponding to mapped RAPDs distributed throughout the genome and in coupling phase with respect to the wild-type parent is then used to amplify DNA from wild-type and mutant pools of F2 individuals. Linkage between the mutant gene and the RAPD markers is visualized by the absence of the corresponding RAPD DNA bands in the mutant pool. We developed a mathematical model for calculating the probability of linkage between RAPDs and target genes and we successfully tested this approach with the model plant Arabidopsis thaliana.
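
The logic of reading linkage from a missing band can be made concrete with a short calculation. It is a sketch under simplifying assumptions and not the authors' published model: the mutation is recessive (so every F2 individual in the mutant pool is homozygous mutant), the RAPD band is dominant and in coupling with the wild-type allele, individuals are independent, and a band shows up whenever at least one pooled individual carries it. With recombination fraction r, each mutant-bearing gamete lacks the band with probability 1 - r, so a pool of n individuals lacks it with probability (1 - r)^(2n). The function name is hypothetical.

```python
def prob_band_absent(recombination_fraction: float, pool_size: int) -> float:
    """Probability that a coupling-phase RAPD band is absent from a mutant pool.

    Each of the 2 * pool_size mutant-bearing gametes carries the band only if
    it is recombinant between marker and gene (probability r).
    """
    if not 0.0 <= recombination_fraction <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return (1.0 - recombination_fraction) ** (2 * pool_size)


if __name__ == "__main__":
    for r in (0.0, 0.01, 0.05, 0.5):
        print(f"r={r}: P(band absent in pool of 10) = {prob_band_absent(r, 10):.3g}")
```

For an unlinked marker (r = 0.5) the band is almost never absent from a pool of ten, so a missing band is strong evidence of linkage under these assumptions.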

19.
Discovery of rare mutations in populations: TILLING by sequencing   Total citations: 1, self-citations: 0, citations by others: 1
Discovery of rare mutations in populations requires methods, such as TILLING (for Targeting Induced Local Lesions in Genomes), for processing and analyzing many individuals in parallel. Previous TILLING protocols employed enzymatic or physical discrimination of heteroduplexed from homoduplexed target DNA. Using mutant populations of rice (Oryza sativa) and wheat (Triticum durum), we developed a method based on Illumina sequencing of target genes amplified from multidimensionally pooled templates representing 768 individuals per experiment. Parallel processing of sequencing libraries was aided by unique tracer sequences and barcodes allowing flexibility in the number and pooling arrangement of targeted genes, species, and pooling scheme. Sequencing reads were processed and aligned to the reference to identify possible single-nucleotide changes, which were then evaluated for frequency, sequencing quality, intersection pattern in pools, and statistical relevance to produce a Bayesian score with an associated confidence threshold. Discovery was robust both in rice and wheat using either bidimensional or tridimensional pooling schemes. The method compared favorably with other molecular and computational approaches, providing high sensitivity and specificity.
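
The "intersection pattern in pools" idea can be sketched for a bidimensional scheme. Individuals sit on a grid, each row and each column is pooled, and a mutation is assigned to the individuals at the intersections of positive row and column pools. The 24 × 32 layout used here for 768 individuals is an assumption for illustration only, since the abstract does not give the grid dimensions, and the sketch ignores sequencing error and the Bayesian scoring described above.

```python
from itertools import product

N_ROWS, N_COLS = 24, 32  # hypothetical layout: 24 * 32 = 768 individuals


def pools_for_individual(index: int, n_cols: int = N_COLS) -> tuple[int, int]:
    """Row pool and column pool that contain the individual with this index."""
    return divmod(index, n_cols)


def candidate_carriers(positive_rows: set[int],
                       positive_cols: set[int]) -> list[tuple[int, int]]:
    """Grid positions consistent with the observed positive pools.

    Every intersection of a positive row with a positive column is a
    candidate, so several carriers of the same mutation produce spurious
    candidates.
    """
    return sorted(product(sorted(positive_rows), sorted(positive_cols)))


if __name__ == "__main__":
    # One carrier: the intersection is unique.
    print(candidate_carriers({1}, {2}))
    # Two carriers at (1, 2) and (3, 5): four candidates, two of them false.
    print(candidate_carriers({1, 3}, {2, 5}))
```

The second case shows why a third pooling dimension helps: an additional set of pools can rule out intersections that two dimensions cannot distinguish.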

20.
Like many viruses, Hepatitis C Virus (HCV) has a high mutation rate, which helps the virus adapt quickly, but mutations come with fitness costs. Fitness costs can be studied by different approaches, such as experimental or frequency-based approaches. The frequency-based approach is particularly useful to estimate in vivo fitness costs, but this approach works best when deep sequencing data from many hosts are available. In this study, we applied the frequency-based approach to a large dataset of 195 patients and estimated the fitness costs of mutations at 7957 sites along the HCV genome. We used beta regression and random forest models to better understand how different factors influenced fitness costs. Our results revealed that costs of nonsynonymous mutations were three times higher than those of synonymous mutations, and mutations at nucleotides A or T had higher costs than those at C or G. Genome location had a modest effect, with lower costs for mutations in HVR1 and higher costs for mutations in Core and NS5B. Resistance mutations were, on average, costlier than other mutations. Our results show that in vivo fitness costs of mutations can be site and virus specific, reinforcing the utility of constructing in vivo fitness cost maps of viral genomes.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  ICP license no. 京ICP备09084417号