Similar articles
20 similar articles found; search took 500 ms
1.
Testing for deviations from Hardy–Weinberg equilibrium (HWE) is a common practice for quality control in genetic studies. Variable sites violating HWE may be identified as technical errors in the sequencing or genotyping process, or they may be of particular evolutionary interest. Large‐scale genetic studies based on next‐generation sequencing (NGS) methods have become more prevalent as cost is decreasing, but these methods are still associated with statistical uncertainty. The large‐scale studies usually consist of samples from diverse ancestries that make the existence of some degree of population structure almost inevitable. Precautions are therefore needed when analysing these data sets, as population structure causes deviations from HWE. Here we propose a method that takes population structure into account in the testing for HWE, such that other factors causing deviations from HWE can be detected. We show the effectiveness of PCAngsd in low‐depth NGS data, as well as in genotype data, for both simulated and real data sets, where the use of genotype likelihoods enables us to model the uncertainty.

2.
The classical χ²‐procedure for the assessment of Hardy–Weinberg equilibrium (HWE) is tailored for detecting violations of HWE. However, many applications in genetic epidemiology require approximate compatibility with HWE. In a previous contribution to the field (Wellek, S. (2004). Biometrics, 60, 694–703), the methodology of statistical equivalence testing was exploited for the construction of tests for problems in which the assumption of approximate compatibility of a given genotype distribution with HWE plays the role of the alternative hypothesis one aims to establish. In this article, we propose a procedure serving the same purpose but relying on confidence limits rather than critical bounds of a significance test. Interval estimation relates to essentially the same parametric function that was previously chosen as the target parameter for constructing an exact conditional UMPU test for equivalence with an HWE‐conforming genotype distribution. This population parameter is shown to have a direct genetic interpretation as a measure of relative excess heterozygosity. Confidence limits are constructed using both asymptotic and exact methods. The new approach is illustrated by reanalyzing genotype distributions obtained from published genetic association studies, and detailed guidance for choosing the equivalence margin is provided. The methods have been implemented in freely available SAS macros.

3.
4.
Standard errors for attributable risk for simple and complex sample designs   Cited by: 1 (self-citations: 0, citations by others: 1)
Graubard BI, Fears TR. Biometrics, 2005, 61(3): 847-855
Adjusted attributable risk (AR) is the proportion of diseased individuals in a population that is due to an exposure. We consider estimates of adjusted AR based on odds ratios from logistic regression to adjust for confounding. Influence function methods used in survey sampling are applied to obtain simple and easily programmable expressions for estimating the variance of AR. These variance estimators can be applied to data from case-control, cross-sectional, and cohort studies with or without frequency or individual matching and for sample designs with subject samples that range from simple random samples to (sample) weighted multistage stratified cluster samples like those used in national household surveys. The variance estimation of AR is illustrated with: (i) a weighted stratified multistage clustered cross-sectional study of childhood asthma from the Third National Health and Nutrition Examination Survey (NHANES III), and (ii) a frequency-matched case-control study of melanoma skin cancer.

5.
Statistical tests for Hardy–Weinberg equilibrium are important elementary tools in genetic data analysis. X‐chromosomal variants have long been tested by applying autosomal test procedures to females only, and gender is usually not considered when testing autosomal variants for equilibrium. Recently, we proposed specific X‐chromosomal exact test procedures for bi‐allelic variants that include the hemizygous males, as well as autosomal tests that consider gender. In this study, we present the extension of the previous work for variants with multiple alleles. A full enumeration algorithm is used for the exact calculations of tri‐allelic variants. For variants with many alternate alleles, we use a permutation test. Some empirical examples with data from the 1000 Genomes Project are discussed.

6.
Deng HW, Chen WM, Recker RR. Genetics, 2001, 157(2): 885-897
In association studies searching for genes underlying complex traits, the results are often inconsistent, and population admixture has been recognized qualitatively as one major potential cause. Hardy-Weinberg equilibrium (HWE) is often employed to test for population admixture; however, its power is generally unknown. Through analytical and simulation approaches, we quantify the power of the HWE test for population admixture and the effects of population admixture on increasing the type I error rate of association studies under various scenarios of population differentiation and admixture. We found that (1) the power of the HWE test for detecting population admixture is usually small; (2) population admixture seriously elevates type I error rate for detecting genes underlying complex traits, the extent of which depends on the degrees of population differentiation and admixture; (3) HWE testing for population admixture should be performed with random samples or only with controls at the candidate genes, or the test can be performed for combined samples of cases and controls at marker loci that are not linked to the disease; (4) testing HWE for population admixture generally reduces false positive association findings of genes underlying complex traits but the effect is small; and (5) with population admixture, a linkage disequilibrium method that employs cases only is more robust and yields many fewer false positive findings than conventional case-control analyses. Therefore, unless random samples are carefully selected from one homogeneous population, admixture is always a legitimate concern for positive findings in association studies except for the analyses that deliberately control population admixture.
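
The link between admixture and HWE deviation mentioned above can be illustrated with the standard Wahlund effect: pooling subpopulations that are each in HWE but differ in allele frequency produces fewer heterozygotes than HWE predicts for the pooled allele frequency. The sketch below is only an illustration under that textbook assumption and is not the analysis used in the cited study; the function names are hypothetical.

```python
def pooled_genotype_freqs(allele_freqs, weights):
    """Genotype frequencies (AA, AB, BB) in a pooled sample.

    allele_freqs: frequency of allele A in each subpopulation
    weights: mixing proportions of the subpopulations (must sum to 1)
    Assumption: each subpopulation is itself in HWE.
    """
    aa = sum(w * p * p for p, w in zip(allele_freqs, weights))
    ab = sum(w * 2 * p * (1 - p) for p, w in zip(allele_freqs, weights))
    bb = sum(w * (1 - p) ** 2 for p, w in zip(allele_freqs, weights))
    return aa, ab, bb


def heterozygote_deficit(allele_freqs, weights):
    """Expected-under-HWE minus actual heterozygote frequency in the pool.

    Algebraically this equals twice the weighted variance of the
    subpopulation allele frequencies, so it is never negative.
    """
    p_bar = sum(w * p for p, w in zip(allele_freqs, weights))
    _, ab, _ = pooled_genotype_freqs(allele_freqs, weights)
    return 2 * p_bar * (1 - p_bar) - ab


if __name__ == "__main__":
    # Two equally sized subpopulations with allele frequencies 0.2 and 0.8.
    print(pooled_genotype_freqs([0.2, 0.8], [0.5, 0.5]))
    print(heterozygote_deficit([0.2, 0.8], [0.5, 0.5]))
```

When the subpopulation allele frequencies are close, the deficit is small, which is consistent with the abstract's finding that the HWE test has little power to detect admixture.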

7.
Many molecular ecology analyses assume the genotyped individuals are sampled at random from a population and thus are representative of the population. Realistically, however, a sample may contain excessive close relatives (ECR) because, for example, localized juveniles are drawn from fecund species. Our knowledge is limited about how ECR affect the routinely conducted elementary genetics analyses, and how ECR are best dealt with to yield unbiased and accurate parameter estimates. This study quantifies the effects of ECR on some popular population genetics analyses of marker data, including the estimation of allele frequencies, F‐statistics, expected heterozygosity (He), effective and observed numbers of alleles, and the tests of Hardy–Weinberg equilibrium (HWE) and linkage equilibrium (LE). It also investigates several strategies for handling ECR to mitigate their impact and to yield accurate parameter estimates. My analytical work, assisted by simulations, shows that ECR have large and global effects on all of the above marker analyses. The naïve approach of simply ignoring ECR could yield low‐precision and often biased parameter estimates, and could cause too many false rejections of HWE and LE. The bold approach, which simply identifies and removes ECR, and the cautious approach, which estimates target parameters (e.g., He) by accounting for ECR and using naïve allele frequency estimates, eliminate the bias and the false HWE and LE rejections, but could reduce estimation precision substantially. The likelihood approach, which accounts for ECR in estimating allele frequencies and thus target parameters relying on allele frequencies, usually yields unbiased and the most accurate parameter estimates. Which of the four approaches is the most effective and efficient may depend on the particular marker analysis to be conducted. The results are discussed in the context of using marker data for understanding population properties and marker properties.

8.
Studies of genetic population structure often involve numerous tests of Hardy–Weinberg equilibrium (HWE), linkage disequilibrium (LD) and genetic differentiation. Tests of HWE or LD are important precursors to population structure assessments. When conducting multiple related statistical tests, type I error increases, e.g., familywise error rate (FWER) inflation. FWER inflation can alter the results of statistical tests and thus the conclusions. We surveyed literature from 2011 to 2013 to determine if studies of population structure assess LD and HWE and if FWER corrections were applied consistently across different types of tests. We found significantly inconsistent FWER corrections, with a bias towards less restrictive correction on genetic differentiation and more restrictive corrections with LD and HWE. While varied adjustments of FWER for different types of analyses might be justified, papers with inconsistent usage across tests of HWE, LD and genetic differentiation did not present rationale for their FWER corrections. We also found a lack of documentation of HWE, LD and FWER corrections in studies. We encourage the authors to report statistical tests and related FWER corrections, use FWER corrections consistently or justify their different methods in the same study.
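
To make the FWER inflation discussed above concrete, the following sketch computes the familywise error rate for independent tests and applies two widely used corrections, Bonferroni and Holm. It is an illustration only: the abstract does not say which corrections the surveyed studies used, and the function names and example p-values are hypothetical.

```python
def fwer_independent(alpha, m):
    """Probability of at least one false rejection among m independent
    true-null tests, each run at level alpha."""
    return 1 - (1 - alpha) ** m


def bonferroni(pvals, alpha=0.05):
    """Reject test i when p_i <= alpha / m (single-step, most conservative)."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def holm(pvals, alpha=0.05):
    """Holm step-down procedure: compare the sorted p-values against
    alpha/m, alpha/(m-1), ... and stop at the first failure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # all larger p-values are also not rejected
    return reject


if __name__ == "__main__":
    pvals = [0.01, 0.04, 0.02, 0.005]  # made-up p-values for illustration
    print(fwer_independent(0.05, 10))
    print(bonferroni(pvals))
    print(holm(pvals))
```

Holm controls the FWER at the same level as Bonferroni but never rejects fewer hypotheses, which is one way two studies can reach different conclusions from the same p-values depending on the correction chosen.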

9.
Population geneticists often use multiple independent hypothesis tests of Hardy–Weinberg Equilibrium (HWE), Linkage Disequilibrium (LD), and population differentiation to make broad inferences about their systems of choice. However, correcting for Family‐Wise Error Rates (FWER) that are inflated due to multiple comparisons is sparingly reported in our current literature. In this issue of Molecular Ecology Resources, a meta‐analysis of 215 population genetics studies published between 2011 and 2013 shows (i) scarce use of FWER corrections across all three classes of tests, and (ii) when used, inconsistent application of correction methods with a clear bias towards less‐conservative corrections for tests of population differentiation than for tests of HWE and LD. Here we replicate this meta‐analysis using 205 population genetics studies published between 2013 and 2018, to show the same continued disuse and inconsistencies. We hope that both studies serve as a wake‐up call to population geneticists, reviewers, and editors to be rigorous about consistently correcting for FWER inflation.

10.
Deviations from Hardy–Weinberg expectations are frequently a sign of genotyping error. hw‐quickcheck is an easy‐to‐use computer program for detecting departures from Hardy–Weinberg equilibrium. hw‐quickcheck uses exact tests for all of its calculations. These tests include a global test for heterozygote excess/deficiency and genotype‐specific tests.

11.
Deviations from Hardy-Weinberg equilibrium (HWE) can indicate inbreeding, population stratification, and even problems in genotyping. In samples of affected individuals, these deviations can also provide evidence for association. Tests of HWE are commonly performed using a simple χ² goodness-of-fit test. We show that this χ² test can have inflated type I error rates, even in relatively large samples (e.g., samples of 1,000 individuals that include approximately 100 copies of the minor allele). On the basis of previous work, we describe exact tests of HWE together with efficient computational methods for their implementation. Our methods adequately control type I error in large and small samples and are computationally efficient. They have been implemented in freely available code that will be useful for quality assessment of genotype data and for the detection of genetic association or population stratification in very large data sets.
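
The contrast between the χ² goodness-of-fit test and an exact test can be sketched for a bi-allelic marker as follows. This is a plain textbook version, not the authors' freely available code: the exact p-value here sums the conditional probabilities of all heterozygote counts that are no more likely than the observed one, given the sample size and allele counts, and it enumerates every admissible count rather than using an efficient recursion. Function names are hypothetical, and the χ² function assumes both alleles are present.

```python
from math import exp, lgamma, log


def hwe_chi2(n_aa, n_ab, n_bb):
    """Pearson chi-square statistic (1 degree of freedom) for HWE.
    Assumes both alleles are observed, so no expected count is zero."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def _log_prob(n_ab, n, n_a):
    """Log-probability of n_ab heterozygotes given n individuals and
    n_a copies of allele A, under HWE."""
    n_b = 2 * n - n_a
    n_aa = (n_a - n_ab) // 2
    n_bb = (n_b - n_ab) // 2
    return (lgamma(n + 1) - lgamma(n_aa + 1) - lgamma(n_ab + 1)
            - lgamma(n_bb + 1) + n_ab * log(2)
            + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1))


def hwe_exact(n_aa, n_ab, n_bb):
    """Exact HWE p-value by full enumeration of heterozygote counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab
    n_b = 2 * n - n_a
    # Heterozygote counts must share the parity of the allele counts.
    probs = {h: exp(_log_prob(h, n, n_a))
             for h in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    p_obs = probs[n_ab]
    # Small tolerance so floating-point ties count as "no more likely".
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9)))


if __name__ == "__main__":
    print(hwe_chi2(25, 50, 25))
    print(hwe_exact(2, 0, 2))
```

The exact test needs no large-sample approximation, which is why it remains valid when the minor allele is rare and some expected genotype counts are small.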

12.
This study explores socio-economic gradients in height (stature-for-age) among a nationally representative sample of 2–6 year old children in the United States. We use NHANES III (1988–1994) Youth data linked with a special Natality Data supplement which contains information from birth certificates among sampled NHANES III Youth who are <7 years of age. Our results indicate significant socio-economic gradients for both maternal education and family income, net of controls for confounders, including: birth weight, gestational age, family size, and parental heights. These results are in stark contrast to those from other developed countries that seem to indicate diminished or eliminated socio-economic disparities, net of known confounders. In the United States, it appears that socio-economic gradients have an effect on birth outcomes, and continue to have an additional direct and independent effect on height, even in early childhood.

13.
Case‐control studies are primary study designs used in genetic association studies. Sasieni (Biometrics 1997, 53, 1253–1261) pointed out that the allelic chi‐square test used in genetic association studies is invalid when Hardy‐Weinberg equilibrium (HWE) is violated in a combined population. It is important to know how far the type I error rate deviates from the nominal level when HWE is violated. We examine bounds on the type I error rate of the allelic chi‐square test. We also investigate the power of the goodness‐of‐fit test for HWE, which can be used as a guideline for selecting an appropriate test between the allelic chi‐square test and the modified allelic chi‐square test, the latter of which was proposed for cases of violated HWE. In small samples, power is not large enough to detect Wright's inbreeding model with small values of the inbreeding coefficient. Therefore, when the null hypothesis of HWE is barely accepted, the modified test should be considered as an alternative method. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

14.
Recently, many case-control studies have been proposed to test for association between haplotypes and disease, which require the Hardy-Weinberg equilibrium (HWE) assumption of haplotype frequencies. As such, haplotype inference of unphased genotypes and development of haplotype-based HWE tests are crucial prior to fine mapping. The goodness-of-fit test is a frequently-used method to test for HWE for multiple tightly-linked loci. However, its degrees of freedom dramatically increase with the increase of the number of loci, which may reduce the test power. Therefore, in this paper, to improve the test power for haplotype-based HWE, we first write out two likelihood functions of the observed data based on Niu's model (NM) and the inbreeding model (IM), respectively, which can cause the departure from HWE. Then, we use two expectation-maximization algorithms and one expectation-conditional-maximization algorithm to estimate the model parameters under the HWE, IM and NM models, respectively. Finally, we propose the likelihood ratio tests LRT and LRT for haplotype-based HWE under the NM and IM models, respectively. We simulate the HWE, Niu's, inbreeding and population stratification models to assess the validity and compare the performance of these two LRT tests. The simulation results show that both of the tests control the type I error rates well in testing for haplotype-based HWE. If the NM model is true, then LRT is more powerful. If the true model is the IM model, then LRT has better performance in power. Under the population stratification model, LRT is still more powerful. To this end, LRT is generally recommended. Application of the proposed methods to a rheumatoid arthritis data set further illustrates their utility for real data analysis.

15.
Hardy-Weinberg equilibrium (HWE) is a useful indicator of genotype frequencies within a population and whether they are based on a valid definition of alleles and a randomly mating sample. HWE assumes a stable population of adequate size without selective pressures and is used in human genetic studies as a guide to data quality by comparing observed genotype frequencies to those expected within a population. The calculation of genetic associations in case-control studies assumes that the population is "in HWE." Canine breed populations deviate from many of the criteria for HWE, and if genetic markers are not in HWE, conventional statistical analysis cannot be performed. To date, little attention has been paid as to whether genetic markers in dog breeds are distributed in compliance with HWE. In this study, 109 single-nucleotide polymorphisms (SNPs) were genotyped from 13 genes in a cohort of 894 dogs encompassing 33 breeds. Analysis of the entire cohort of dogs revealed a significant deviation from HWE for all SNPs tested (P < 0.00001); analysis of the cohort stratified by breed and subbreed indicated that the majority of the markers complied with HWE expectation. This suggests that canine case-control association studies will be valid if performed within defined breeds.

16.
It is widely acknowledged that genome-wide association studies (GWAS) of complex human disease fail to explain a large portion of heritability, primarily due to lack of statistical power, a problem that is exacerbated when seeking detection of interactions of multiple genomic loci. An untapped source of information that is already widely available, and that is expected to grow in coming years, is population samples. Such samples contain genetic marker data for additional individuals, but not their relevant phenotypes. In this article we develop a highly efficient testing framework based on a constrained maximum-likelihood estimate in a case–control–population setting. We leverage the available population data and optional modeling assumptions, such as Hardy–Weinberg equilibrium (HWE) in the population and linkage equilibrium (LE) between distal loci, to substantially improve power of association and interaction tests. We demonstrate, via simulation and application to actual GWAS data sets, that our approach is substantially more powerful and robust than standard testing approaches that ignore or make naive use of the population sample. We report several novel and credible pairwise interactions, in bipolar disorder, coronary artery disease, Crohn’s disease, and rheumatoid arthritis.

17.
Chen H, Stasny EA, Wolfe DA. Biometrics, 2006, 62(1): 150-158
The application of ranked set sampling (RSS) techniques to data from a dichotomous population is currently an active research topic, and it has been shown that balanced RSS leads to improvement in precision over simple random sampling (SRS) for estimation of a population proportion. Balanced RSS, however, is not in general optimal in terms of variance reduction for this setting. The objective of this article is to investigate the application of unbalanced RSS in estimation of a population proportion under perfect ranking, where the probabilities of success for the order statistics are functions of the underlying population proportion. In particular, the Neyman allocation, which assigns sample units for each order statistic proportionally to its standard deviation, is shown to be optimal in the sense that it leads to minimum variance within the class of RSS estimators that are simple averages of the means of the order statistics. We also use a substantial data set, the National Health and Nutrition Examination Survey III (NHANES III) data, to demonstrate the feasibility and benefits of Neyman allocation in RSS for binary variables.
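
The Neyman allocation described above can be sketched for a binary variable under perfect ranking. In a set of m units with success probability p, the r-th smallest value equals 1 exactly when at least m − r + 1 of the m units are successes, so its success probability is a binomial tail. The code below is a minimal illustration under those assumptions, not the authors' implementation; the function names are hypothetical, the allocations are left unrounded, and in practice p is unknown and would have to be replaced by a prior guess.

```python
from math import comb, sqrt


def order_stat_success_probs(p, m):
    """P(X_(r) = 1) for r = 1..m, where X_(r) is the r-th order statistic
    of m independent Bernoulli(p) values (perfect ranking assumed)."""
    pmf = [comb(m, k) * p ** k * (1 - p) ** (m - k) for k in range(m + 1)]
    # X_(r) = 1 exactly when the number of successes K is >= m - r + 1.
    return [sum(pmf[m - r + 1:]) for r in range(1, m + 1)]


def neyman_allocation(p, m, n):
    """Split n measured units across the m order statistics in proportion
    to the standard deviation of each order statistic."""
    probs = order_stat_success_probs(p, m)
    sds = [sqrt(q * (1 - q)) for q in probs]
    total = sum(sds)
    return [n * s / total for s in sds]


if __name__ == "__main__":
    print(order_stat_success_probs(0.5, 3))
    print(neyman_allocation(0.5, 3, 30))
```

With p = 0.5 and m = 3, the middle order statistic has the largest standard deviation and therefore receives the largest share, whereas balanced RSS would assign the same number of units to every rank.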

18.
Much forensic inference based upon DNA evidence is made assuming Hardy-Weinberg Equilibrium (HWE) for the genetic loci being used. Several statistical tests to detect and measure deviation from HWE have been devised, and their limitations become more obvious when testing for deviation within multiallelic DNA loci. The most popular methods (Chi-square and Likelihood-ratio tests) are based on asymptotic results and cannot guarantee a good performance in the presence of low frequency genotypes. Since the parameter space dimension increases at a quadratic rate in the number of alleles, some authors suggest applying sequential methods, where the multiallelic case is reformulated as a sequence of "biallelic" tests. However, in this approach it is not obvious how to assess the general evidence of the original hypothesis; nor is it clear how to establish the significance level for its acceptance/rejection. In this work, we introduce a straightforward method for the multiallelic HWE test, which overcomes the aforementioned issues of sequential methods. The core theory for the proposed method is given by the Full Bayesian Significance Test (FBST), an intuitive Bayesian approach which does not assign positive probabilities to zero measure sets when testing sharp hypotheses. We compare FBST performance to Chi-square, Likelihood-ratio and Markov chain tests, in three numerical experiments. The results suggest that FBST is a robust and high performance method for the HWE test, even in the presence of several alleles and small sample sizes.

19.
We propose a new technique for the exact test of Hardy‐Weinberg proportion that considerably extends the bounds of computational feasibility. Our algorithm is constructed analogously to the network algorithm for the Freeman‐Halton exact test in two‐way contingency tables. In this algorithm, the smallest and the largest values for the statistic are important and some interesting new theorems are proved for computing these values. Numerical examples are given to illustrate the practicality of our algorithm.

20.
Choi JW, McHugh RB. Biometrics, 1989, 45(3): 979-996
Situations often arise in a large-scale household survey where a complex probability sample of clusters rather than of individuals is drawn from a large population. Typically, the clusters of such complex samples include a number of correlated members. The responses of these members are then weighted to obtain estimates for the population. Such weighted data are commonly published by the National Center for Health Statistics and other U.S. federal agencies. Frequently, problems arise when such data are tested by usual chi-square test statistics for goodness of fit or independence. Researchers have discovered that the usual chi-square tests provide spuriously inflated results when applied to cluster samples and that new methods are required to correct such problems. This paper proposes a strategy for a goodness-of-fit or independence test based on correlated and weighted data arising in cluster samples, and provides a factor that validly reduces the inflation of the usual chi-square statistics. This method is applied to the chronic condition data collected from the St Paul-Minneapolis, Minnesota, primary sampling unit (PSU) during the 1975 National Health Interview Survey (NHIS). This analysis, together with simulation studies presented elsewhere, provides evidence that the usual chi-square statistics from such data can be corrected for the impacts of clustering and weighting by use of the proposed reduction factor.
