Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
The purpose of this work is to quantify the effects that errors in genotyping have on power and the sample size necessary to maintain constant asymptotic Type I and Type II error rates (SSN) for case-control genetic association studies between a disease phenotype and a di-allelic marker locus, for example a single nucleotide polymorphism (SNP) locus. We consider the effects of three published models of genotyping errors on the chi-square test for independence in the 2 x 3 table. After specifying genotype frequencies for the marker locus conditional on disease status and error model in both a genetic model-based and a genetic model-free framework, we compute the asymptotic power to detect association through specification of the test's non-centrality parameter. This parameter determines the functional dependence of SSN on the genotyping error rates. Additionally, we study the dependence of SSN on linkage disequilibrium (LD), marker allele frequencies, and genotyping error rates for a dominant disease model. Increased genotyping error rate requires a larger SSN. Every 1% increase in the sum of genotyping error rates requires that both case and control SSN be increased by 2-8%, with the extent of increase dependent upon the error model. For the dominant disease model, SSN is a nonlinear function of LD and genotyping error rate, with greater SSN for lower LD and higher genotyping error rate. The combination of lower LD and higher genotyping error rates requires a larger SSN than the sum of the SSN for the lower LD and for the higher genotyping error rate.
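
The dependence of sample size on the non-centrality parameter described in entry 1 can be illustrated with a small sketch. It is not the authors' code: the misclassification matrix (`adjacent_error_matrix`), the genotype frequencies, and all function names are hypothetical, and the error model is a simple made-up one rather than any of the three published models the abstract refers to. It assumes equal numbers of cases and controls and uses the standard asymptotic non-centrality formula for a 2 x 3 table.

```python
import math

def observed_freqs(true_freqs, error_matrix):
    """Apply a misclassification matrix: error_matrix[i][j] = P(observed j | true i)."""
    return [sum(true_freqs[i] * error_matrix[i][j] for i in range(3))
            for j in range(3)]

def ncp_per_pair(case_freqs, control_freqs):
    """Non-centrality contributed by one case plus one control (equal group sizes)."""
    return sum((p - q) ** 2 / (p + q)
               for p, q in zip(case_freqs, control_freqs) if p + q > 0)

def power_2df(ncp, alpha=0.05, terms=200):
    """Power of a 2-df chi-square test, using the Poisson-mixture form of the
    non-central chi-square distribution (adequate for moderate ncp only)."""
    half_x = -math.log(alpha)      # critical value x solves exp(-x/2) = alpha
    poisson = math.exp(-ncp / 2)   # Poisson(ncp/2) weight for j = 0
    term = math.exp(-half_x)       # exp(-x/2) * (x/2)^i / i! for i = 0
    tail = term                    # survival function of chi-square with 2 df
    power = 0.0
    for j in range(terms):
        power += poisson * tail
        poisson *= (ncp / 2) / (j + 1)
        term *= half_x / (j + 1)
        tail += term               # survival function with 2 + 2(j + 1) df
    return power

def sample_size_inflation(case_freqs, control_freqs, error_matrix):
    """Factor by which sample size must grow to keep the same ncp under error."""
    clean = ncp_per_pair(case_freqs, control_freqs)
    noisy = ncp_per_pair(observed_freqs(case_freqs, error_matrix),
                         observed_freqs(control_freqs, error_matrix))
    return clean / noisy

def adjacent_error_matrix(e):
    """Hypothetical model: each genotype is miscalled as a neighbouring genotype."""
    return [[1 - e, e, 0.0], [e / 2, 1 - e, e / 2], [0.0, e, 1 - e]]

# Illustrative (made-up) genotype frequencies for cases and controls.
cases = [0.30, 0.50, 0.20]
controls = [0.36, 0.48, 0.16]
inflation = sample_size_inflation(cases, controls, adjacent_error_matrix(0.01))
power_clean = power_2df(500 * ncp_per_pair(cases, controls))  # 500 cases, 500 controls
```

Because power at a fixed significance level depends on the data only through the non-centrality parameter, the ratio of the error-free to the error-affected per-pair value gives the required proportional increase in sample size.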

2.
3.
OBJECTIVE: In affected sib pair studies without genotyped parents the effect of genotyping error is generally to reduce the type I error rate and power of tests for linkage. The effect of genotyping error when parents have been genotyped is unknown. We investigated the type I error rate of the single-point Mean test for studies in which genotypes of both parents are available. METHODS: Datasets were simulated assuming no linkage and one of five models for genotyping error. In each dataset, Mendelian-inconsistent families were either excluded or regenotyped, and then the Mean test applied. RESULTS: We found that genotyping errors lead to an inflated type I error rate when inconsistent families are excluded. Depending on the genotyping-error model assumed, regenotyping inconsistent families has one of several effects. It may produce the same type I error rate as if inconsistent families are excluded; it may reduce the type I error, but still leave an anti-conservative test; or it may give a conservative test. Departures of the type I error rate from its nominal level increase with both the genotyping error rate and sample size. CONCLUSION: We recommend that markers with high error rates either be excluded from the analysis or be regenotyped in all families.

4.
Cox DG, Kraft P. Human Heredity, 2006, 61(1): 10-14
Deviation from Hardy-Weinberg equilibrium has become an accepted test for genotyping error. While it is generally considered that testing departures from Hardy-Weinberg equilibrium to detect genotyping error is not sensitive, little has been done to quantify this sensitivity. Therefore, we have examined various models of genotyping error, including error caused by neighboring SNPs that degrade the performance of genotyping assays. We then calculated the power of chi-square goodness-of-fit tests for deviation from Hardy-Weinberg equilibrium to detect such error. We have also examined the effects of neighboring SNPs on risk estimates in the setting of case-control association studies. We modeled the power of departure from Hardy-Weinberg equilibrium as a test to detect genotyping error and quantified the effect of genotyping error on disease risk estimates. Generally, genotyping error does not generate sufficient deviation from Hardy-Weinberg equilibrium to be detected. As expected, genotyping error due to neighboring SNPs attenuates risk estimates, often drastically. For the moment, the most widely accepted method of detecting genotyping error is to confirm genotypes by sequencing and/or genotyping via a separate method. While these methods are fairly reliable, they are also costly and time consuming.
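
The chi-square goodness-of-fit test for Hardy-Weinberg equilibrium mentioned in entry 4 can be sketched as follows. This is the textbook 1-df test for a di-allelic marker, not code from the paper; the function name `hwe_chi_square` and the example counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes counts of the three genotypes at a di-allelic locus and returns
    (statistic, p_value) for the 1-df test.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # estimated frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 df, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Made-up counts: one sample exactly in HWE, one with a heterozygote deficit.
stat_hwe, p_hwe = hwe_chi_square(250, 500, 250)
stat_def, p_def = hwe_chi_square(300, 400, 300)
```

A small statistic here does not show that genotypes are error-free, which is the abstract's point: many error patterns shift genotype counts too little for this test to detect.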

5.
Hao K, Li C, Rosenow C, Hung Wong W. Genomics, 2004, 84(4): 623-630
Currently, most analytical methods assume all observed genotypes are correct; however, it is clear that errors may reduce statistical power or bias inference in genetic studies. We propose procedures for estimating error rate in genetic analysis and apply them to study the GeneChip Mapping 10K array, which is a technology that has recently become available and allows researchers to survey over 10,000 SNPs in a single assay. We employed a strategy to estimate the genotype error rate in pedigree data. First, the "dose-response" reference curve between error rate and the observable error number was derived by simulation, conditional on given pedigree structures and genotypes. Second, the error rate was estimated by calibrating the number of observed errors in real data to the reference curve. We evaluated the performance of this method by simulation study and applied it to a data set of 30 pedigrees genotyped using the GeneChip Mapping 10K array. This method performed favorably in all scenarios we surveyed. The dose-response reference curve was monotone and almost linear with a large slope. The method was able to estimate accurately the error rate under various pedigree structures and error models and under heterogeneous error rates. Using this method, we found that the average genotyping error rate of the GeneChip Mapping 10K array was about 0.1%. Our method provides a quick and unbiased solution to address the genotype error rate in pedigree data. It behaves well in a wide range of settings and can be easily applied in other genetic projects. The robust estimation of genotyping error rate allows us to estimate power and sample size and conduct unbiased genetic tests. The GeneChip Mapping 10K array has a low overall error rate, which is consistent with the results obtained from alternative genotyping assays.

6.
The net benefit from investing in any technology is a function of the cost of implementation and the expected return in revenue. The objective of the present study was to quantify, using deterministic equations, the net monetary benefit from investing in genotyping of commercial females. Three case studies were presented reflecting dairy cows, beef cows and ewes based on Irish population parameters; sensitivity analyses were also performed. Parameters considered in the sensitivity analyses included the accuracy of genomic evaluations, replacement rate, proportion of female selection candidates retained as replacements, the cost of genotyping, the sire parentage error rate and the age of the female when it first gave birth. Results were presented as an annualised monetary net benefit over the lifetime of an individual, after discounting for the timing of expressions. In the base scenarios, the net benefit was greatest for dairy, followed by beef and then sheep. The net benefit improved as the reliability of the genomic evaluations improved and, in fact, a negative net benefit of genotyping was less frequent when the reliability of the genomic evaluations was high. The impact of a 10% point increase in genomic reliability was, however, greatest in sheep, followed by beef and then dairy. The net benefit of genotyping female selection candidates reduced as replacement rate increased. As genotyping costs increased, the net benefit reduced irrespective of the percentage of selection candidates kept, the replacement rate or even the population considered. Nonetheless, the association between the genotyping cost and the net benefit of genotyping differed by the percentage of selection candidates kept. Across all replacement rates evaluated, retaining 25% of the selection candidates resulted in the greatest net benefit when genotyping cost was low but the lowest net benefit when genotyping cost was high. 
Genotyping breakeven cost was non-linearly associated with the percentage of selection candidates retained, reaching a maximum when 50% of selection candidates were retained, irrespective of replacement rate, genomic reliability or the population. The genotyping breakeven cost was also non-linearly associated with replacement rate. The approaches outlined within provide the back-end framework for a decision support tool to quantify the net benefit of genotyping, once parameterised by the relevant population metrics.

7.
Conservation and population genetic studies are sometimes hampered by insufficient quantities of high quality DNA. One potential way to overcome this problem is through the use of whole genome amplification (WGA) kits. We performed rolling circle WGA on DNA obtained from matched hair and tissue samples of North American red squirrels (Tamiasciurus hudsonicus). Following polymerase chain reaction (PCR) at four microsatellite loci, we compared genotyping success for DNA from different source tissues, both pre‐ and post‐WGA. Genotypes obtained with tissue were robust, whether or not DNA had been subjected to WGA. DNA extracted from hair produced results that were largely concordant with matched tissue samples, although amplification success was reduced and some allelic dropout was observed. WGA of hair samples resulted in a low genotyping success rate and an unacceptably high rate of allelic dropout and genotyping error. The problem was not rectified by conducting PCR of WGA hair samples in triplicate. Therefore, we conclude that WGA is only an effective method of enhancing template DNA quantity when the initial sample is from high‐yield material.

8.
Errors while genotyping are inevitable and can reduce the power to detect linkage. However, does genotyping error have the same impact on linkage results for single-nucleotide polymorphism (SNP) and microsatellite (MS) marker maps? To evaluate this question we detected genotyping errors that are consistent with Mendelian inheritance using large changes in multipoint identity-by-descent sharing in neighboring markers. Only a small fraction of Mendelian consistent errors were detectable (e.g., 18% of MS and 2.4% of SNP genotyping errors). More SNP genotyping errors are Mendelian consistent compared to MS genotyping errors, so genotyping error may have a greater impact on linkage results using SNP marker maps. We also evaluated the effect of genotyping error on the power and type I error rate using simulated nuclear families with missing parents under 0, 0.14, and 2.8% genotyping error rates. In the presence of genotyping error, we found that the power to detect a true linkage signal was greater for SNP (75%) than MS (67%) marker maps, although there were also slightly more false-positive signals using SNP marker maps (5 compared with 3 for MS). Finally, we evaluated the usefulness of accounting for genotyping error in the SNP data using a likelihood-based approach, which restores some of the power that is lost when genotyping error is introduced.

9.
Genotyping errors are present in almost all genetic data and can affect biological conclusions of a study, particularly for studies based on individual identification and parentage. Many statistical approaches can incorporate genotyping errors, but usually need accurate estimates of error rates. Here, we used a new microsatellite data set developed for brown rockfish (Sebastes auriculatus) to estimate genotyping error using three approaches: (i) repeat genotyping 5% of samples, (ii) comparing unintentionally recaptured individuals and (iii) Mendelian inheritance error checking for known parent–offspring pairs. In each data set, we quantified genotyping error rate per allele due to allele drop‐out and false alleles. Genotyping error rate per locus revealed an average overall genotyping error rate by direct count of 0.3%, 1.5% and 1.7% (0.002, 0.007 and 0.008 per allele error rate) from replicate genotypes, known parent–offspring pairs and unintentionally recaptured individuals, respectively. By direct‐count error estimates, the recapture and known parent–offspring data sets revealed an error rate four times greater than estimated using repeat genotypes. There was no evidence of correlation between error rates and locus variability for all three data sets, and errors appeared to occur randomly over loci in the repeat genotypes, but not in recaptures and parent–offspring comparisons. Furthermore, there was no correlation in locus‐specific error rates between any two of the three data sets. Our data suggest that repeat genotyping may underestimate true error rates and may not estimate locus‐specific error rates accurately. We therefore suggest using methods for error estimation that correspond to the overall aim of the study (e.g. known parent–offspring comparisons in parentage studies).
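
A direct-count error estimate from repeat genotypes, as in approach (i) of entry 9, might look like the sketch below. The data layout (a list of single-locus genotypes per run, each a pair of allele sizes, `None` for a failed call), the function name, and the allele sizes are assumptions for illustration; the paper's own bookkeeping may differ. Mismatches are pooled over loci here, and a mismatch only shows that the two runs disagree, not which run is wrong.

```python
from collections import Counter

def replicate_error_rates(first_run, second_run):
    """Direct-count mismatch rates between two genotyping runs.

    Returns (per_locus_rate, per_allele_rate): the share of compared
    single-locus genotypes that differ, and the share of compared alleles
    that differ. Genotypes with a missing call in either run are skipped.
    """
    compared = mismatched_genotypes = mismatched_alleles = 0
    for g1, g2 in zip(first_run, second_run):
        if g1 is None or g2 is None:
            continue
        compared += 1
        # Alleles are unordered, so count shared alleles as a multiset overlap.
        shared = sum((Counter(g1) & Counter(g2)).values())
        differing = 2 - shared
        mismatched_alleles += differing
        mismatched_genotypes += differing > 0
    if compared == 0:
        return 0.0, 0.0
    return mismatched_genotypes / compared, mismatched_alleles / (2 * compared)

# Made-up microsatellite allele sizes for four single-locus genotypes.
run_1 = [(100, 104), (98, 98), (110, 112), None]
run_2 = [(104, 100), (98, 100), (110, 112), (90, 90)]
per_locus, per_allele = replicate_error_rates(run_1, run_2)
```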

10.
In non‐model organisms, evolutionary questions are frequently addressed using reduced representation sequencing techniques due to their low cost, ease of use, and because they do not require genomic resources such as a reference genome. However, evidence is accumulating that such techniques may be affected by specific biases, questioning the accuracy of obtained genotypes, and as a consequence, their usefulness in evolutionary studies. Here, we introduce three strategies to estimate genotyping error rates from such data: through the comparison to high quality genotypes obtained with a different technique, from individual replicates, or from a population sample when assuming Hardy‐Weinberg equilibrium. Applying these strategies to data obtained with Restriction site Associated DNA sequencing (RAD‐seq), arguably the most popular reduced representation sequencing technique, revealed per‐allele genotyping error rates that were much higher than sequencing error rates, particularly at heterozygous sites that were wrongly inferred as homozygous. As we exemplify through the inference of genome‐wide and local ancestry of well characterized hybrids of two Eurasian poplar (Populus) species, such high error rates may lead to wrong biological conclusions. By properly accounting for these error rates in downstream analyses, either by incorporating genotyping errors directly or by recalibrating genotype likelihoods, we were nevertheless able to use the RAD‐seq data to support biologically meaningful and robust inferences of ancestry among Populus hybrids. Based on these findings, we strongly recommend carefully assessing genotyping error rates in reduced representation sequencing experiments and properly accounting for these in downstream analyses, for instance using the tools presented here.

11.
He Y, Li C, Amos CI, Xiong M, Ling H, Jin L. PLoS ONE, 2011, 6(7): e22097
The genome-wide association study (GWAS) has become a routine approach for mapping disease risk loci with the advent of large-scale genotyping technologies. Multi-allelic haplotype markers can provide superior power compared with single-SNP markers in mapping disease loci. However, the application of haplotype-based analysis to GWAS is usually bottlenecked by the prohibitive time cost of haplotype inference, also known as phasing. In this study, we developed an efficient approach to haplotype-based analysis in GWAS. By using a reference panel, our method accelerated the phasing process and reduced the potential bias generated by unrealistic assumptions in the phasing process. The haplotype-based approach delivers great power and no type I error inflation for association studies. With only a medium-size reference panel, phasing error in our method is comparable to the genotyping error afforded by commercial genotyping solutions.

12.
There has been remarkably little attention to using the high resolution provided by genotyping‐by‐sequencing (i.e., RADseq and similar methods) for assessing relatedness in wildlife populations. A major hurdle is the genotyping error, especially allelic dropout, often found in this type of data that could lead to downward‐biased, yet precise, estimates of relatedness. Here, we assess the applicability of genotyping‐by‐sequencing for relatedness inferences given its relatively high genotyping error rate. Individuals of known relatedness were simulated under genotyping error, allelic dropout and missing data scenarios based on an empirical ddRAD data set, and their true relatedness was compared to that estimated by seven relatedness estimators. We found that an estimator chosen through such analyses can circumvent the influence of genotyping error, with the estimator of Ritland (Genetics Research, 67, 175) shown to be unaffected by allelic dropout and to be the most accurate when there is genotyping error. We also found that the choice of estimator should not rely solely on the strength of correlation between estimated and true relatedness as a strong correlation does not necessarily mean estimates are close to true relatedness. We also demonstrated how even a large SNP data set with genotyping error (allelic dropout or otherwise) or missing data still performs better than a perfectly genotyped microsatellite data set of tens of markers. The simulation‐based approach used here can be easily implemented by others on their own genotyping‐by‐sequencing data sets to confirm the most appropriate and powerful estimator for their data.

13.
Cheng KF, Chen JH. Human Heredity, 2007, 64(2): 114-122
The transmission/disequilibrium test (TDT), a family based test of linkage and association, is a popular test for studies of complex inheritance, as it is nonparametric and robust against spurious conclusions induced by hidden genetic structure, such as stratification or admixture. However, the TDT may be biased by genotyping errors. Undetected genotyping errors may be contributing to an inflated type I error rate among reported TDT-derived associations. To adjust for bias, a popular approach is to assume a genotype error model for describing the pattern of errors and propose association tests using a likelihood method. However, all model-based approaches tend to perform unsatisfactorily if the related genotyping error rates are not identical across all families. In this paper, we propose a TDT-type association test which is not only simple, robust against population stratification (and hence the assumption of Hardy-Weinberg equilibrium is not required), but also robust against genotyping error with error rates varying across families. Simulation studies confirm that the new test has very reasonable performance.
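
For orientation, the classical TDT that entry 13 starts from can be sketched as below. This is the standard McNemar-type statistic and not the error-robust test proposed in the paper. The coding of genotypes as copies of allele A (0, 1 or 2), the function name `tdt`, and the example trios are assumptions made for illustration.

```python
import math

def tdt(trios):
    """Classical TDT from (father, mother, child) genotypes coded as the
    number of copies of allele A (0, 1 or 2).

    Returns (b, c, statistic, p_value), where b and c count transmissions of
    A and of the other allele from heterozygous parents. Trios that are
    Mendelian-inconsistent are skipped.
    """
    b = c = 0
    for father, mother, child in trios:
        parents = (father, mother)
        het = sum(1 for g in parents if g == 1)
        if het == 0:
            continue  # no informative parent
        # A homozygous parent always transmits g // 2 copies of A.
        from_hom = sum(g // 2 for g in parents if g != 1)
        transmitted_a = child - from_hom
        if not 0 <= transmitted_a <= het:
            continue  # Mendelian inconsistency, e.g. a genotyping error
        b += transmitted_a
        c += het - transmitted_a
    stat = (b - c) ** 2 / (b + c) if b + c > 0 else 0.0
    p_value = math.erfc(math.sqrt(stat / 2))  # 1-df chi-square tail
    return b, c, stat, p_value

# Made-up trios; the last one is Mendelian-inconsistent and is dropped.
example_trios = [(1, 0, 1), (1, 0, 1), (1, 2, 2), (1, 1, 2), (1, 0, 0), (0, 0, 2)]
b, c, stat, p_value = tdt(example_trios)
```

Dropping inconsistent trios is the simplest handling of detected errors; the abstract's concern is the errors that stay Mendelian-consistent and so pass this check unnoticed.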

14.
Despite much discussion of the importance of quantifying and reporting genotyping error in molecular studies, it is still not standard practice in the literature. This is particularly a concern for amplified fragment length polymorphism (AFLP) studies, where differences in laboratory, peak‐calling and locus‐selection protocols can generate data sets varying widely in genotyping error rate, the number of loci used and potentially estimates of genetic diversity or differentiation. In our experience, papers rarely provide adequate information on AFLP reproducibility, making meaningful comparisons among studies difficult. To quantify the extent of this problem, we reviewed the current molecular ecology literature (470 recent AFLP articles) to determine the proportion of studies that report an error rate and follow established guidelines for assessing error. Fifty‐four per cent of recent articles do not report any assessment of data set reproducibility. Of those studies that do claim to have assessed reproducibility, the majority (~90%) either do not report a specific error rate or do not provide sufficient details to allow the reader to judge whether error was assessed correctly. Even of the papers that do report an error rate and provide details, many (≥23%) do not follow recommended standards for quantifying error. These issues also exist for other marker types such as microsatellites, and next‐generation sequencing techniques, particularly those which use restriction enzymes for fragment generation. Therefore, we urge all researchers conducting genotyping studies to estimate and more transparently report genotyping error using existing guidelines and encourage journals to enforce stricter standards for the publication of genotyping studies.

15.
Moskvina V, Schmidt KM. Biometrics, 2006, 62(4): 1116-1123
With the availability of fast genotyping methods and genomic databases, the search for statistical association of single nucleotide polymorphisms with a complex trait has become an important methodology in medical genetics. However, even fairly rare errors occurring during the genotyping process can lead to spurious association results and decrease in statistical power. We develop a systematic approach to study how genotyping errors change the genotype distribution in a sample. The general M-marker case is reduced to that of a single-marker locus by recognizing the underlying tensor-product structure of the error matrix. Both method and general conclusions apply to the general error model; we give detailed results for allele-based errors of size depending both on the marker locus and the allele present. Multiple errors are treated in terms of the associated diffusion process on the space of genotype distributions. We find that certain genotype and haplotype distributions remain unchanged under genotyping errors, and that genotyping errors generally render the distribution more similar to the stable one. In case-control association studies, this will lead to loss of statistical power for nondifferential genotyping errors and increase in type I error for differential genotyping errors. Moreover, we show that allele-based genotyping errors do not disturb Hardy-Weinberg equilibrium in the genotype distribution. In this setting we also identify maximally affected distributions. As they correspond to situations with rare alleles and marker loci in high linkage disequilibrium, careful checking for genotyping errors is advisable when significant association based on such alleles/haplotypes is observed in association studies.
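
The claim in entry 15 that allele-based errors leave Hardy-Weinberg proportions intact can be checked numerically with a small sketch. It borrows the tensor-product idea but applies it to the two allele copies at a single locus, which is an illustrative simplification and not the authors' M-marker construction. The error rates `e_A` and `e_a`, the allele frequency, and all function names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def apply_errors(dist, matrix):
    """Push a distribution through an error matrix (row vector times matrix)."""
    return [sum(dist[i] * matrix[i][j] for i in range(len(dist)))
            for j in range(len(matrix[0]))]

def hwe_ordered(p):
    """HWE distribution over ordered allele pairs (A,A), (A,a), (a,A), (a,a)."""
    f = [p, 1 - p]
    return [f[i] * f[k] for i in range(2) for k in range(2)]

# Allele-specific miscall rates: A read as a with rate e_A, a read as A with e_a.
e_A, e_a = 0.02, 0.05
allele_errors = [[1 - e_A, e_A], [e_a, 1 - e_a]]   # rows: true allele, cols: observed

# Errors hit each allele copy independently, so the genotype-level error
# matrix is the Kronecker product of the allele-level matrix with itself.
genotype_errors = kron(allele_errors, allele_errors)

p = 0.3
observed = apply_errors(hwe_ordered(p), genotype_errors)

# Observed frequency of allele A after errors; the observed genotype
# distribution should again be in HWE, at this shifted frequency.
p_obs = p * (1 - e_A) + (1 - p) * e_a
expected = hwe_ordered(p_obs)
```

The observed distribution matches Hardy-Weinberg proportions at the shifted allele frequency, so a Hardy-Weinberg test cannot flag errors of this kind.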

16.
A study including eight microsatellite loci for 1,014 trees from seven mapped stands of the partially clonal Populus euphratica was used to demonstrate how genotyping errors influence estimates of clonality. With a threshold of 0 (identical multilocus genotypes constitute one clone) we identified 602 genotypes. A threshold of 1 (compensating for an error in one allele) lowered this number to 563. Genotyping errors can seemingly merge (type 1 error), split really existing clones (type 2), or convert a unique genotype into another unique genotype (type 3). We used context information (sex and spatial position) to estimate the type 1 error. For thresholds of 0 and 1 the estimate was below 0.021, suggesting a high resolution for the marker system. The rate of genotyping errors was estimated by repeated genotyping for a cohort of 41 trees drawn at random (0.158), and a second cohort of 40 trees deviating in one allele from another tree (0.368). For the latter cohort, most of these deviations turned out to be errors, but 8 out of 602 obtained multilocus genotypes may represent somatic mutations, corresponding to a mutation rate of 0.013. A simulation of genotyping errors for populations with varying clonality and evenness showed the number of genotypes always to be overestimated for a system with high resolution, and this mistake increases with increasing clonality and evenness. Allowing a threshold of 1 compensates for most genotyping errors and leads to much more precise estimates of clonality compared with a threshold of 0. This lowers the resolution of the marker system, but comparison with context information can help to check if the resolution is sufficient to apply a higher threshold. We recommend simulation procedures to investigate the behavior of a marker system for different thresholds and error rates to obtain the best estimate of clonality.
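
The effect of the mismatch threshold in entry 16 on the number of distinct clones can be sketched as follows. The abstract does not say how genotypes within the threshold are grouped, so single-linkage grouping with a union-find structure is an assumption here, as are the function names and the toy two-locus genotypes.

```python
from collections import Counter

def allele_distance(g1, g2):
    """Number of differing alleles between two multilocus genotypes, each
    given as a tuple of per-locus allele pairs (order within a pair ignored)."""
    return sum(2 - sum((Counter(a) & Counter(b)).values())
               for a, b in zip(g1, g2))

def count_clones(genotypes, threshold=0):
    """Count clones, treating genotypes within `threshold` differing alleles
    as the same clone (single-linkage grouping via union-find)."""
    parent = list(range(len(genotypes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(genotypes)):
        for j in range(i + 1, len(genotypes)):
            if allele_distance(genotypes[i], genotypes[j]) <= threshold:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(genotypes))})

# Made-up genotypes at two loci: trees 1 and 2 are identical, tree 3 differs
# from them by a single allele, tree 4 is clearly distinct.
trees = [((1, 2), (3, 3)), ((2, 1), (3, 3)), ((1, 2), (3, 4)), ((5, 6), (7, 8))]
clones_strict = count_clones(trees, threshold=0)
clones_tolerant = count_clones(trees, threshold=1)
```

Raising the threshold absorbs single-allele errors but also lowers the resolution of the marker system, which is the trade-off the abstract proposes to check against context information.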

17.
DNA extracted from hair or faeces shows increasing promise for censusing populations whose individuals are difficult to locate. To date, the main problem with this approach has been that genotyping errors are common. If these errors are not identified, counting genotypes is likely to overestimate the number of individuals in a population. Here, we describe an algorithm that uses maximum likelihood estimates of genotyping error rates to calculate the evidence that samples came from the same individual. We test this algorithm with a hypothetical model of genotyping error and show that this algorithm works well with substantial rates of genotyping error and reasonable amounts of data. Additional work is necessary to develop statistical models of error in empirical data.

18.
In noninvasive genetic sampling, when genotyping error rates are high and recapture rates are low, misidentification of individuals can lead to overestimation of population size. Thus, estimating genotyping errors is imperative. Nonetheless, conducting multiple polymerase chain reactions (PCRs) at multiple loci is time-consuming and costly. To address the controversy regarding the minimum number of PCRs required for obtaining a consensus genotype, we compared, consumer-style, the performance of two genotyping protocols (multiple-tubes and 'comparative method') with respect to genotyping success and error rates. Our results from 48 faecal samples of river otters (Lontra canadensis) collected in Wyoming in 2003, and from blood samples of five captive river otters amplified with four different primers, suggest that use of the comparative genotyping protocol can minimize the number of PCRs per locus. For all but five samples at one locus, the same consensus genotypes were reached with fewer PCRs and with reduced error rates with this protocol compared to the multiple-tubes method. This finding is reassuring because genotyping errors can occur at relatively high rates even in tissues such as blood and hair. In addition, we found that loci that amplify readily and yield consensus genotypes may still exhibit high error rates (7-32%) and that amplification with different primers resulted in different types and rates of error. Thus, assigning a genotype based on a single PCR for several loci could result in misidentification of individuals. We recommend that programs designed to statistically assign consensus genotypes should be modified to allow the different treatment of heterozygotes and homozygotes intrinsic to the comparative method.

19.
The genotyping of mother–father–child trios is a very useful tool in disease association studies, as trios eliminate population stratification effects and increase the accuracy of haplotype inference. Unfortunately, the use of trios for association studies may reduce power, since it requires the genotyping of three individuals where only four independent haplotypes are involved. We describe here a method for genotyping a trio using two DNA pools, thus reducing the cost of genotyping trios to that of genotyping two individuals. Furthermore, we present extensions to the method that exploit the linkage disequilibrium structure to compensate for missing data and genotyping errors. We evaluated our method on trios from CEPH pedigree 66 of the Coriell Institute. We demonstrate that the error rates in the genotype calls of the proposed protocol are comparable to those of standard genotyping techniques, although the cost is reduced considerably. The approach described is generic and it can be applied to any genotyping platform that achieves a reasonable precision of allele frequency estimates from pools of two individuals. Using this approach, future trio-based association studies may be able to increase the sample size by 50% for the same cost and thereby increase the power to detect associations.

20.
Microsatellite genotyping is a common DNA characterization technique in population, ecological and evolutionary genetics research. Since different alleles are sized relative to internal size-standards, different laboratories must calibrate and standardize allelic designations when exchanging data. This interchange of microsatellite data can often prove problematic. Here, 16 microsatellite loci were calibrated and standardized for the Atlantic salmon, Salmo salar, across 12 laboratories. Although inconsistencies were observed, particularly due to differences between migration of DNA fragments and actual allelic size ('size shifts'), inter-laboratory calibration was successful. Standardization also allowed an assessment of the degree and partitioning of genotyping error. Notably, the global allelic error rate was reduced from 0.05 ± 0.01 prior to calibration to 0.01 ± 0.002 post-calibration. Most errors were found to occur during analysis (i.e. when size-calling alleles; the mean proportion of all errors that were analytical errors across loci was 0.58 after calibration). No evidence was found of an association between the degree of error and allelic size range of a locus, number of alleles, nor repeat type, nor was there evidence that genotyping errors were more prevalent when a laboratory analyzed samples outside of the usual geographic area they encounter. The microsatellite calibration between laboratories presented here will be especially important for genetic assignment of marine-caught Atlantic salmon, enabling analysis of marine mortality, a major factor in the observed declines of this highly valued species.
